Adaptive Feature Specific Spectroscopy for Rapid Chemical Identification

Spectroscopic chemical classification based on adaptive, feature-specific measurements has been implemented and demonstrated to provide significant performance gain over traditional systems. The measurement scheme and the decision model are discussed. A prototype system with a digital micro-mirror device as the adaptive element has been constructed and validates the theoretical findings and simulation results.

References and Links
Detection technologies for chemical warfare agents and toxic vapors (CRC, 2005).
2. W. Pearman and A. Fountain, "Classification of chemical and biological warfare agent simulants by surface-enhanced Raman spectroscopy and multivariate statistical techniques," J.
Detection and identification of explosive RDX by THz diffuse reflection spectroscopy," J.
Terahertz spectroscopy and imaging for defense and security applications," Proc.
scopic method for identification of clinically relevant microorganisms growing on solid culture medium," Anal.
cation of medically relevant microorganisms by vibrational spectroscopy," J.
Miniaturized total chemical analysis systems: a novel concept for chemical sensing," Sens.
A CFAR adaptive matched filter detector," IEEE Aerosp.
Optical moving target detection with 3-D matched filtering," IEEE Aerosp.
Sequential analysis with more than two alternative hypotheses, and its relation to discriminant function analysis," J.


Introduction
Spectroscopy-based chemical classification has a number of critical applications, for example in defense [1,2], security [3,4], and medicine [5,6]. Traditional spectroscopic approaches, however, struggle in low-signal situations where a small number of signal photons are apportioned across many detectors. Many of these critical applications fall into this low-signal category because of the weak analyte signatures associated with such tasks. Rapid identification is also frequently required because of the serious repercussions associated with these application areas, which further limits signal acquisition time.
In this manuscript, an alternate spectrometer architecture is discussed in detail. This novel architecture, which we call the adaptive feature specific spectrometer (AFSS), overcomes the limitations posed by the traditional approaches. Unlike traditional systems, where measurements are made by sampling across each spectral channel, the AFSS makes use of feature-based measurements wherein the feature vectors are adaptively reconfigured based on the information gathered from each measurement. The decision model used by this alternate design is based on sequential hypothesis testing, a multiple-measurement framework which constantly monitors the quality of information obtained after each measurement. Early simulation results have shown significant performance gains over traditional spectrometers in the low SNR regions. An experimental prototype was constructed to validate the findings of our simulations. The experimental setup, assumptions and results are presented and discussed below.
It should be noted that what we propose here is a feature-based measurement architecture to improve the classification efficiency of spectrometers in the low SNR regime. Specific design choices for the AFSS were made at a practical level in order to test our simulations with a working laboratory system. This architecture can be applied to any context to which one might apply traditional spectroscopy. We believe the demonstrated performance gains to be qualitatively independent of choices such as wavelength range, spectral resolution, decision framework, error thresholds, and spectral library content.

Feature specific spectroscopy
A variety of chemical classification schemes already exist [7,8], many of them based on optical spectroscopy [9,10]. In traditional spectroscopy, the spectral channels are sampled individually and a noise contribution is associated with each of the channels. The traditional measurement vector $m_t$ is therefore mathematically represented as:
$$m_t = s + n_t$$
where $s$ is a length-$p$ column vector corresponding to the incoming spectrum to be classified and $n_t$ denotes a length-$p$ column vector where each element corresponds to the noise contribution associated with the individual spectral channels (assumed below to be zero-mean AWGN with standard deviation $\sigma_n$). The primary task of spectral classification involves the determination of the best match for the spectrum $s$ from a given spectral library $S$. $S$ is a known $p \times r$ library, with $r$ the number of spectra present.
Feature specific measurements have proven to be a very good alternative to traditional measurement schemes, especially in the imaging domain where considerable performance gain has been reported [11,12]. These are multiplexed measurements [13] wherein the incoming spectrum $s$ is optically projected onto an arbitrary feature vector before making the measurement. The feature specific measurement $m_p$ is given by:
$$m_p = P s + n_p$$
where $P$ is a $q \times p$ projection matrix consisting of $q$ length-$p$ feature vectors. Here, $n_p$ is a length-$q$ column vector corresponding to the noise (again AWGN with standard deviation $\sigma_n$).
Since the projection of the incoming spectrum onto an arbitrary basis vector may be viewed as a linear combination of the individual spectral components, the overall signal strength of the measurement is increased relative to the noise contribution. This multiplexing improves the measurement SNR considerably when compared with the traditional measurements.
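The multiplexing argument above can be illustrated with a minimal numeric sketch. The flat 8-channel spectrum, the all-pass feature vector and the function names are assumptions chosen for illustration only; they are not taken from the AFSS implementation.

```python
def traditional_snr(spectrum, sigma_n):
    """Per-channel SNR when each channel is read by its own noisy detector."""
    return [s / sigma_n for s in spectrum]


def feature_snr(spectrum, feature, sigma_n):
    """SNR of a single projective measurement m_p = feature . spectrum + noise."""
    signal = sum(f * s for f, s in zip(feature, spectrum))
    return abs(signal) / sigma_n


spectrum = [0.5] * 8   # hypothetical weak, flat spectrum (arbitrary units)
feature = [1.0] * 8    # hypothetical feature that passes every channel
sigma_n = 1.0          # same detector noise in both schemes

per_channel = traditional_snr(spectrum, sigma_n)
multiplexed = feature_snr(spectrum, feature, sigma_n)
print(max(per_channel), multiplexed)
```

In this toy case the single multiplexed measurement collects the signal of all eight channels against one noise contribution, whereas each traditional channel sees only its own share of the signal.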

Detection frameworks-Sequential hypothesis testing
Matched-filter based detection frameworks [14,15], where the unknown spectrum is correlated with a given spectral library S and the largest result is identified as the best match, are the simplest means of classifying spectra. Matched filter concepts prove challenging in low SNR applications, however. In such scenarios, a single measurement is frequently not enough to decide in favor of a definite hypothesis. In addition, matched filters are sub-optimal when considering more than two hypotheses. A number of static multiple-measurement frameworks based on Neyman-Pearson [16,17] or Bayesian [18] criteria exist in which the number of measurements is decided a priori. However, such techniques have a disadvantage: because the number of measurements is fixed, the process may terminate prior to reaching a desired confidence, or may waste time taking additional measurements even though the desired confidence level has already been reached. Sequential hypothesis testing (SHT) [19] is a technique which overcomes these limitations by constantly monitoring the quality of information obtained after each measurement. The likelihood ratios of the competing hypotheses are evaluated after each measurement and are compared with decision thresholds set by the user. The decision thresholds may be formulated using standard Neyman-Pearson or Bayesian criteria using the desired error tolerances. Wald initially proposed the concepts of SHT for binary hypotheses. This was later extended to multiple hypotheses by Armitage [20]. Both cases are described below.
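Considering two hypotheses $H_0$ and $H_1$, the conditional probabilities of the hypotheses given a series of $k$ feature-specific measurements $\{m_p\}_k$ may be expressed in terms of the likelihoods using Bayes' theorem as:
$$\Pr\left(H_i \mid \{m_p\}_k\right) = \frac{L_{i,k}\,\Pr(H_i)}{\Pr\left(\{m_p\}_k\right)}, \qquad i = 0, 1.$$
The likelihood of the $i$th hypothesis is given by $L_{i,k}$, which is formulated as follows:
$$L_{i,k} = \Pr\left(\{m_p\}_k \mid H_i\right).$$
The ratio of the two conditional probabilities may be expressed in terms of the likelihood ratio $\Lambda_k = L_{0,k}/L_{1,k}$ and the prior probabilities of the two hypotheses. The probability ratio may be written as:
$$\frac{\Pr\left(H_0 \mid \{m_p\}_k\right)}{\Pr\left(H_1 \mid \{m_p\}_k\right)} = \Lambda_k\,\frac{\Pr(H_0)}{\Pr(H_1)}.$$
If the measurement at the $k$th iteration is denoted $m_{p,k}$, a very general update procedure for the likelihood ratio $\Lambda_k$ may be designed. The likelihood ratio at the $k$th step may be written as:
$$\Lambda_k = \Lambda_{k-1}\,\frac{\Pr\left(m_{p,k} \mid H_0\right)}{\Pr\left(m_{p,k} \mid H_1\right)},$$
as the likelihoods are conditional probabilities, which may be updated as follows:
$$L_{i,k} = L_{i,k-1}\,\Pr\left(m_{p,k} \mid H_i\right).$$
Standard Neyman-Pearson criteria may be used to determine the decision thresholds $\Theta_0$ and $\Theta_1$, and the decision may then be made in the following fashion:
Decide in favor of $H_0$ if $\Lambda_k > \Theta_0$.
Decide in favor of $H_1$ if $\Lambda_k < \Theta_1$.
Do not make a decision and make another measurement if $\Theta_1 \le \Lambda_k \le \Theta_0$.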
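The binary sequential test described above can be sketched as follows. This is a minimal illustration, not the authors' code: it assumes scalar Gaussian measurements with a known mean under each hypothesis, works with the logarithm of the likelihood ratio for numerical convenience, and sets the thresholds with Wald's classical approximations, which the text does not specify. All function and variable names are hypothetical.

```python
import math
import random


def log_likelihood_ratio(m, mean0, mean1, sigma):
    """log[ Pr(m | H0) / Pr(m | H1) ] for a scalar Gaussian measurement."""
    return ((m - mean1) ** 2 - (m - mean0) ** 2) / (2.0 * sigma ** 2)


def sequential_test(measure, mean0, mean1, sigma, theta0, theta1,
                    max_steps=100000):
    """Run the binary sequential test; return (decision, measurements used)."""
    log_lambda = 0.0  # log of the running likelihood ratio
    upper, lower = math.log(theta0), math.log(theta1)
    for k in range(1, max_steps + 1):
        log_lambda += log_likelihood_ratio(measure(), mean0, mean1, sigma)
        if log_lambda > upper:
            return 0, k   # decide in favor of H0
        if log_lambda < lower:
            return 1, k   # decide in favor of H1
    return None, max_steps  # still undecided within the step budget


# Thresholds from Wald's approximations for 1% error rates (an assumption).
alpha = beta = 0.01
theta0 = (1.0 - beta) / alpha
theta1 = beta / (1.0 - alpha)

rng = random.Random(0)  # seeded so the demo is repeatable
decision, steps = sequential_test(lambda: rng.gauss(1.0, 2.0),
                                  1.0, 0.0, 2.0, theta0, theta1)
print(decision, steps)
```

The number of measurements is not fixed in advance: the loop stops as soon as the accumulated evidence crosses either threshold.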

Extension of SHT to multiple hypotheses
For the case of $w$ hypotheses where $w > 2$, the likelihood ratios may be updated as:
$$\Lambda_{i,j;k} = \Lambda_{i,j;k-1}\,\frac{\Pr\left(m_{p,k} \mid H_i\right)}{\Pr\left(m_{p,k} \mid H_j\right)}$$
where $m_{p,k}$ denotes the $k$th measurement and $\Lambda_{i,j;k}$ is now the $(i,j)$th element of a $w \times w$ likelihood ratio matrix $\Lambda$. The elements of the $\Lambda$ matrix are the pairwise likelihood ratios and represent the probability ratios $\Pr\left(H_i \mid \{m_p\}_k\right) / \Pr\left(H_j \mid \{m_p\}_k\right)$. All the elements in a single row (except the diagonal element) are compared with the respective threshold to decide in favor of a particular hypothesis. The decision making process may be formulated as follows:
Decide in favor of $H_i$ if $\Lambda_{i,j;k} > \Theta_i \;\; \forall\, j \neq i$.
Do not make a decision and make another measurement otherwise.
It should also be noted that $\Lambda_{i,j;k} = 1/\Lambda_{j,i;k}$ by construction, which ensures that only one hypothesis satisfies the test at a particular instance.
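A minimal sketch of the multi-hypothesis decision rule is given below. It keeps the logarithm of each pairwise likelihood ratio in a $w \times w$ table; the log-likelihood values and the common threshold of 99 are hypothetical numbers chosen for illustration, and the function names are not from the original work.

```python
import math


def update_log_ratios(log_lam, log_like):
    """Add log Pr(m | H_i) - log Pr(m | H_j) to every pairwise log-ratio."""
    w = len(log_like)
    return [[log_lam[i][j] + log_like[i] - log_like[j] for j in range(w)]
            for i in range(w)]


def decide(log_lam, log_theta):
    """Return i if every off-diagonal entry of row i exceeds its threshold."""
    w = len(log_lam)
    for i in range(w):
        if all(log_lam[i][j] > log_theta[i] for j in range(w) if j != i):
            return i
    return None  # no decision yet: take another measurement


w = 3
log_theta = [math.log(99.0)] * w          # hypothetical common threshold
log_lam = [[0.0] * w for _ in range(w)]   # all ratios start at 1
log_like = [0.0, -3.0, -6.0]              # hypothetical log-likelihoods of one measurement

log_lam = update_log_ratios(log_lam, log_like)
first = decide(log_lam, log_theta)        # evidence not yet sufficient
log_lam = update_log_ratios(log_lam, log_like)
second = decide(log_lam, log_theta)       # same evidence seen twice
print(first, second)
```

Because the table is antisymmetric in the log domain, a row whose entries all exceed a positive threshold forces the corresponding entries of every other row to be negative, which is the uniqueness property noted above.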

Adaptivity
A number of ad hoc techniques exist which may be used for synthesizing the feature vectors so that increased discrimination between the spectral projections is achieved. Principal component analysis [21] proves to be a reasonable choice in this regard, as the first principal component captures the direction of greatest variance among the competing spectra, the second component provides the direction of greatest remaining variance, and so on. However, it is important to note that feature decomposition based on principal component analysis is ad hoc in nature. It is possible to find a set of k features which are more discriminatory than the first k principal components. The current AFSS system utilizes feature vectors synthesized via principal component analysis; we are working on techniques for determining optimal feature vectors.
In the traditional implementation of sequential hypothesis testing, the probabilistic information gained after each measurement is not used. We can improve performance by using this information to adapt the features as the measurement process proceeds. If the $b$th spectrum in the spectral library $S$ is denoted by $S_b$, then the mean spectrum $\bar{S}$ is given by:
$$\bar{S} = \frac{1}{r}\sum_{b=1}^{r} S_b .$$
The first $q$ principal components are defined as the $q$ eigenvectors, corresponding to the $q$ largest eigenvalues, of the signal covariance matrix $C$ given by:
$$C = \frac{1}{r}\sum_{b=1}^{r}\left(S_b - \bar{S}\right)\left(S_b - \bar{S}\right)^{T} .$$
In order to incorporate the probabilistic information gained after each measurement, the likelihood ratios are used to evaluate the probability estimate of a hypothesis $H_j$ given a series of $k$ measurements $\{m_p\}_k$:
$$\Pr\left(H_j \mid \{m_p\}_k\right) = \frac{1}{\sum_{i=1}^{w} \Lambda_{i,j;k}}$$
where $\Lambda_{i,j;k}$ is the $(i,j)$th element of the likelihood ratio matrix $\Lambda_k$. The $(i,j)$th element of the likelihood ratio matrix is given by:
$$\Lambda_{i,j;k} = \frac{\Pr\left(H_i \mid \{m_p\}_k\right)}{\Pr\left(H_j \mid \{m_p\}_k\right)} .$$
Essentially, all the elements in column $j$ are summed; because the probabilities in the numerators sum to one, the reciprocal of this sum recovers the common denominator $\Pr\left(H_j \mid \{m_p\}_k\right)$ of the elements in that column.
The probability estimates may be used to determine the probability-weighted covariance matrix $Q_k$, also known as the inter-class scatter matrix,
$$Q_k = \sum_{b=1}^{r} \Pr\left(H_b \mid \{m_p\}_k\right)\left(S_b - \bar{S}_k\right)\left(S_b - \bar{S}_k\right)^{T},$$
with the mean spectrum now the probabilistically weighted mean given by:
$$\bar{S}_k = \sum_{b=1}^{r} \Pr\left(H_b \mid \{m_p\}_k\right) S_b .$$
The $q$ eigenvectors of the inter-class scatter matrix [22] corresponding to the $q$ largest eigenvalues are used as the feature vectors of our choice. The projection of the spectral library onto these feature vectors yields improved discrimination among the hypotheses which are probabilistically still in serious contention. Figure 1 shows a block diagram illustrating the step-by-step procedure of the measurement and detection scheme involved in the adaptive feature specific spectrometer.
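The adaptive feature update described above can be sketched in a few lines. This is an illustrative reconstruction under stated assumptions rather than the authors' implementation: the probability estimate is taken as the reciprocal of a column sum of the likelihood ratio matrix, the scatter matrix is weighted by those estimates, and the leading eigenvector is found with a plain power iteration (an assumption made to keep the example dependency-free). The three-spectrum library and all names are hypothetical.

```python
import math


def posteriors(lam):
    """Probability estimate of H_j: reciprocal of the sum of column j."""
    w = len(lam)
    return [1.0 / sum(lam[i][j] for i in range(w)) for j in range(w)]


def scatter_matrix(library, probs):
    """Probability-weighted scatter of the library spectra about their weighted mean."""
    p = len(library[0])
    mean = [sum(pb * s[c] for pb, s in zip(probs, library)) for c in range(p)]
    q = [[0.0] * p for _ in range(p)]
    for pb, s in zip(probs, library):
        d = [s[c] - mean[c] for c in range(p)]
        for a in range(p):
            for b in range(p):
                q[a][b] += pb * d[a] * d[b]
    return q


def leading_eigenvector(matrix, iterations=200):
    """Power iteration for the eigenvector with the largest eigenvalue."""
    p = len(matrix)
    v = [1.0 + 0.01 * c for c in range(p)]  # arbitrary deterministic start
    for _ in range(iterations):
        u = [sum(matrix[a][b] * v[b] for b in range(p)) for a in range(p)]
        norm = math.sqrt(sum(x * x for x in u))
        if norm == 0.0:
            break
        v = [x / norm for x in u]
    return v


# Hypothetical state: H_0 and H_1 are still in contention, H_2 is nearly ruled out.
true_probs = [0.45, 0.45, 0.10]
lam = [[pi / pj for pj in true_probs] for pi in true_probs]
library = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]

probs = posteriors(lam)
feature = leading_eigenvector(scatter_matrix(library, probs))
print([round(x, 3) for x in feature])
```

In this toy case the resulting feature weights the first two channels with opposite signs and essentially ignores the third, i.e. it concentrates on separating the two hypotheses that remain in contention.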

Simulation results
We characterize the difficulty of a classification task using a metric we call task SNR (TSNR). TSNR is defined as the ratio of the class separation of the spectral library to the standard deviation of noise, where the class separation $\sigma_l$ is represented by the minimum pairwise Euclidean distance between spectra in the library:
$$\sigma_l = \min_{i \neq j} \left\lVert S_i - S_j \right\rVert_2 .$$
Assuming the noise contribution to be AWGN with standard deviation $\sigma_n$, the task SNR is
$$\mathrm{TSNR} = \frac{\sigma_l}{\sigma_n} .$$
At each desired TSNR, 500 simulations were performed and a new random library of class size 5 was chosen from a set of 200 different Raman spectra of pharmaceuticals. It was assumed that a 1% chance of a false positive and a 1% chance of a miss are acceptable. A traditional system was simulated along with a static feature specific system and an AFSS system. The features were synthesized from the eigenvectors of the inter-class scatter matrix (i.e., probabilistically weighted PCA). For the purpose of all our simulations and experiments, we assumed $q = 1$, i.e., we always work with the first principal component.
In order to physically realize a feature vector which contains both positive and negative weights, we decomposed the features into two separate components: one consisting of just the positive weights and the other of just the negative weights. This dual rail implementation has some noise implications due to the presence of two individual noise contributions. The measurements $m^{+}$ and $m^{-}$ corresponding to these two vectors are given by:
$$m^{+} = p^{+} s + n_1, \qquad m^{-} = p^{-} s + n_2$$
where $p^{+}$ and $p^{-}$ are the decomposed feature vectors, $n_1$ and $n_2$ are the two corresponding noise contributions, and $s$ is the incoming spectrum. The actual measurement is obtained by finding the difference $m^{+} - m^{-}$, which is associated with a noise contribution of $\sqrt{2}\,\sigma_n$. The positive and negative elements of the feature vectors were also scaled to 1 and -1 respectively in order to accurately reflect the operation of a non-grayscale DMD. This binary scaling introduces the following trade-off: the implemented projections capture more light (each channel is ultimately collected at full strength), but the projection direction is potentially modified to some extent from the direction of the principal component. More detailed analysis of this trade-off will be addressed in a future publication.
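The task SNR definition and the binary dual rail measurement described above can be sketched as follows. This is a hedged illustration only: the function names, the toy library and the example feature vector are assumptions, and TSNR is computed as a plain ratio (no decibel conversion is applied here).

```python
import math


def task_snr(library, sigma_n):
    """Minimum pairwise Euclidean distance between spectra, divided by sigma_n."""
    separation = min(math.dist(a, b)
                     for i, a in enumerate(library)
                     for b in library[i + 1:])
    return separation / sigma_n


def dual_rail(feature):
    """Split a signed feature into two binary mirror masks (on = 1, off = 0)."""
    plus = [1 if f > 0 else 0 for f in feature]    # positive weights scaled to 1
    minus = [1 if f < 0 else 0 for f in feature]   # negative weights, measured separately
    return plus, minus


def dual_rail_measurement(feature, spectrum, noise):
    """Difference of the two rail measurements; `noise` returns one noise sample."""
    plus, minus = dual_rail(feature)
    m_plus = sum(a * s for a, s in zip(plus, spectrum)) + noise()
    m_minus = sum(a * s for a, s in zip(minus, spectrum)) + noise()
    return m_plus - m_minus


library = [[0.0, 0.0], [3.0, 4.0], [6.0, 8.0]]   # hypothetical two-channel library
print(task_snr(library, 0.5))

feature = [0.3, -0.2, 0.0, 0.9]                  # hypothetical signed feature
print(dual_rail(feature))
print(dual_rail_measurement(feature, [1.0, 2.0, 3.0, 4.0], lambda: 0.0))
```

Because each rail carries its own independent noise sample, the variance of the difference is twice that of a single measurement, which is the origin of the $\sqrt{2}\,\sigma_n$ factor.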
Figure 2 shows the simulation results. The traditional system behaves in the expected manner: the average time to classification increases as the task SNR decreases. In fact, for every 10 dB decrease in the task SNR, there is a 10× increase in the average time to detection. The performance of a static feature-based system is also shown, wherein a constant feature synthesized prior to the first measurement is used repeatedly. Such static feature-based systems improve on the traditional systems in the low TSNR regions due to the multiplexing advantage gained. However, the relative performance gain is not very dramatic. The performance of the AFSS system, on the other hand, is very encouraging. At low TSNR, the average time to classification falls by a factor of close to 150 when compared with the traditional systems. At -50 dB, the AFSS system needs approximately 700 measurements to make a classification, compared with the $10^5$ measurements required by the traditional systems. The performance curves of the traditional and the AFSS system cross each other near the 10 dB point. In the high TSNR region (10 dB to 50 dB), the AFSS system is worse by a factor of 2 when compared with the traditional systems. This is a result of the dual rail implementation of the feature vectors. In spite of the noise implications of the dual rail implementation, the AFSS easily outperforms its traditional counterpart in the low TSNR regions. It is reasonable to consider the impact of the chosen error rate on the performance of the AFSS. However, as both the AFSS and the traditional system utilize the same decision framework, the impact is expected to be similar in both cases, leaving the performance gain of the AFSS qualitatively unchanged. This belief is supported by initial investigations with other error rates.
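As a quick arithmetic check of the figures quoted at -50 dB, using only the approximate values stated in the text (the symbols $N_{\text{traditional}}$ and $N_{\text{AFSS}}$ are introduced here purely for illustration):

```latex
\frac{N_{\text{traditional}}}{N_{\text{AFSS}}} \approx \frac{10^{5}}{700} \approx 143
```

This is in line with the stated gain of close to 150× at low TSNR.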

Experimental implementation
The optical design of the AFSS prototype is based on a traditional dispersive spectrometer where the linear detector array is replaced by an adaptive element which multiplexes the required optical signals onto a large area photoreceiver. The AFSS prototype is designed to operate in a wavelength range of 470-630 nm. Figure 3 shows a schematic of the AFSS system. Zemax was used to optimize the optical design, and a prescription showing the specifications of the optical components involved in the AFSS prototype design is tabulated in Table 1.
Figure 4 is a block diagram showing the basic functional blocks constituting the AFSS prototype. The central component of the prototype is the adaptive spectral filter, which multiplexes selected spectral bands and directs them onto a single-element photodetector. The adaptive spectral filter of our choice is a digital micro-mirror device (DMD) which has been extracted from a Pico Projector Development Kit from Texas Instruments.
The active mirror array in our DMD measures 2.4 mm × 3.6 mm. The individual mirrors on the DMD measure 7.56 μm × 7.56 μm and can tilt by approximately 12°. The DMD has an efficiency of 68% over a wavelength range of 420-700 nm. The DMD has a resolution of 480 × 320 (HVGA resolution). The DMD chipset [23] consists of a DPP1505 processor which performs the HVGA DMD data formatting and control. An ultra-low-power 16-bit RISC mixed-signal microcontroller is also part of the chipset. The RGB LED array which acts as the light source for the projector has been replaced with a bypass circuit to eliminate stray light and to protect the chipset against excessive heat from the LED array. The mirrors are controlled using a Beagleboard, a low-power embedded computer based on an ARM Cortex A8 processor.
Fig. 3. Schematic illustrating the working of an AFSS system and the optics involved in its design.
The Beagleboard engages in bi-directional communication with the computer via TCP/IP sockets. The feature vectors are synthesized by the computer before being transmitted to the Beagleboard, which performs the data and image processing and transmits the appropriate patterns to the DMD. The patterns transmitted to the DMD have VGA resolution (640 × 480) as the DMD chipset is designed to accept only VGA input. The VGA-to-HVGA down-sampling is handled by the processor in the DMD chipset so that the appropriate mirrors are controlled.
A New Focus 2031 large area photoreceiver is used to collect the multiplexed spectral signals and a National Instruments USB6259 DAQ board is utilized for the data acquisition. The digitized signals are then recorded by the computer for further processing and analysis. New feature vectors are generated if the evaluated likelihood ratios are still in the intermediate regions with regard to the decision thresholds. The 3D modelling for the different optical mounts and the system enclosure was done using Solidworks. Figure 5 illustrates the Solidworks model of the AFSS optical device and Fig. 6 is a snapshot of the optical system with all the optical components mounted.

Calibration and library design
The experiment also includes some additional hardware which is of great importance for calibration and signal recovery. A detailed functional block diagram of the AFSS system along with this additional hardware is shown in Fig. 7. A photograph of our entire system along with all the additional hardware is also shown in Fig. 8.
To provide flexibility and adjustability, a spectral library consisting of the spectral information of different combinations of red, green and blue LEDs in an RGB LED array [24] was designed and used in the experimental trials. The LED array allowed us to present the system with a much wider range of source spectra than is tractable with physical samples. A dimmer circuit was designed to vary the intensity of the individual LEDs. Due to the different spatial locations of the individual LEDs in the array, an integrating sphere was used to uniformly mix the light from the three LEDs. A sample 5-class LED library which was used in our experiments is depicted in Fig. 9. Unfortunately, the signal strength of the LED source spectrum is very weak due to the broadband nature of the spectrum and the transmission loss associated with the integrating sphere.

Improving the system SNR
In order to calibrate the AFSS system, it is necessary to determine the response of the system to each element of the feature vector, where each element corresponds to a column of mirrors on the DMD. Working with a length-640 feature vector makes sense as the DMD is designed to accept only VGA inputs. Individual rows of a calibration matrix $C$ (a 640 × 640 identity matrix) are transmitted to the DMD, and the measurements $M_e$ recorded for each case are given by:
$$M_e = C S_e$$
where $S_e$ is the spectrum from the LED array. In order to overcome the limitations posed by the signal strength of the source spectrum, we considered blocks of columns to improve the overall system SNR. A block size of 4 proved to be ideal for the purpose of the experimental validation. A better system SNR was achieved by using S-matrix based signal reconstruction techniques for calibrating the AFSS system. S-matrices with very low condition numbers were used to make multiplexed measurements, which provide a significant SNR advantage. The measurements using the S-matrix based calibration scheme are given by
$$M = C S_e + n_e$$
where $M$ is the measurement vector, $C$ is the new calibration matrix (the S-matrix of appropriate size), $S_e$ is the incoming LED spectrum and $n_e$ is the noise contribution. The multiplexed measurements may then be used to reconstruct the LED spectrum as
$$\hat{S}_e = C^{-1} M$$
where $\hat{S}_e$ is the measured LED spectrum and $C^{-1}$ is the inverted calibration matrix. The noise contribution is negligible.
An optical chopper is used alongside a lock-in amplifier to further amplify and recover the weak optical signals. The optical chopper used in our experiment is an SR540 from Stanford Research Systems. The reference frequency is set at 800 Hz and is fed to an SR810 lock-in amplifier, also from Stanford Research Systems. The settings of the lock-in amplifier are controlled using Matlab on the computer through the NI GPIB controller board. It is important to note that this additional hardware is not an essential part of the AFSS system but is used only to improve the system SNR for this particular proof-of-principle experiment.
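The S-matrix multiplexing and reconstruction step can be sketched as below. This is a small-scale illustration under stated assumptions, not the calibration code used in the experiment: it builds an order-7 S-matrix from a Sylvester Hadamard matrix (a construction that only yields orders of the form $2^k - 1$, chosen here for brevity) and inverts it with the standard closed form $S^{-1} = \frac{2}{n+1}\left(2S^{T} - J\right)$, where $J$ is the all-ones matrix. The test spectrum and function names are hypothetical, and measurement noise is omitted.

```python
def sylvester_hadamard(order):
    """Hadamard matrix of the given order (must be a power of two)."""
    h = [[1]]
    while len(h) < order:
        h = [row + row for row in h] + [row + [-x for x in row] for row in h]
    if len(h) != order:
        raise ValueError("order must be a power of two")
    return h


def s_matrix(n):
    """0/1 S-matrix of order n = 2**k - 1: drop first row/column, map +1 -> 0, -1 -> 1."""
    h = sylvester_hadamard(n + 1)
    return [[(1 - x) // 2 for x in row[1:]] for row in h[1:]]


def multiplexed_measure(s, spectrum):
    """Noise-free multiplexed measurements M = S . spectrum (one per mask row)."""
    return [sum(a * b for a, b in zip(row, spectrum)) for row in s]


def reconstruct(s, measurements):
    """Apply the closed-form inverse (2 / (n + 1)) * (2 S^T - J) to the measurements."""
    n = len(s)
    scale = 2.0 / (n + 1)
    return [scale * sum((2 * s[j][i] - 1) * measurements[j] for j in range(n))
            for i in range(n)]


s = s_matrix(7)
spectrum = [3.0, 1.0, 4.0, 1.0, 5.0, 9.0, 2.0]   # hypothetical 7-channel spectrum
measured = multiplexed_measure(s, spectrum)
print(reconstruct(s, measured))
```

Each mask row opens roughly half of the channels at once, so every measurement carries several channels' worth of signal against a single noise contribution, which is where the SNR advantage of the multiplexed calibration comes from.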

Noise settings and experimental results
A master library consisting of 15 different spectra corresponding to different combinations of the red, green and blue LEDs was designed. A five-class spectral library was chosen at random for each experimental trial, with a random spectrum from this smaller library chosen as the source spectrum. Acceptable error rates were set in the same manner as in the simulations: a 1% chance of a false positive and a 1% chance of a miss.
The feature vectors were synthesized and the positive weights and the negative weights were scaled to 1 and -1 respectively. In order to implement the feature vectors physically using the DMD, the projection vectors were decomposed into two: one with just the positive weights and one with just the negative weights. The measurements corresponding to the two complementary vectors were then recorded using the DAQ, and the difference between the two measurements was evaluated to continue with the decision making process. The noise implications of the dual rail implementation of the feature vectors were already discussed in the Simulation results section. As with the simulations, 500 experimental trials were performed at each desired TSNR level.
To vary the effective TSNR, we injected noise into the AFSS system by varying the quality of the measurements sampled by the DAQ. The DAQ is ideally configured to sample the measurements at 1 kHz. The noise contribution may be varied by changing the data acquisition time of the DAQ. A random feature pattern was transmitted to the DMD, and the standard deviation of the measurements over a predefined period of time was studied as a function of the number of samples per trigger. Figure 10 illustrates this plot on a logarithmic scale along with a linear fit to the curve. The linear fit relates $x$, the samples per trigger, to $y$, the noise parameter. Once the desired task SNR is set, the necessary standard deviation of noise may be evaluated using the class separation of the spectral library as follows:
$$\sigma_n = \frac{\sigma_l}{\mathrm{TSNR}}$$
where $\sigma_n$ is the noise standard deviation and $\sigma_l$ is the class separation. The number of samples which needs to be acquired per trigger may then be set on the DAQ by inverting the linear fit for this value of $\sigma_n$.
The experiments were completed using the above scheme and the results were used to validate our theoretical findings. Figure 11 shows the performance of the AFSS system when compared with a traditional system. The experimental results match very closely with simulation. However, the observed performance gain of the AFSS on the LED spectra is reduced to ≈ 15×, compared to ≈ 150× with pharmaceutical spectra. Two significant differences exist between the two cases. First, the dimensionality of the spectra in the library is different (1300 spectral channels for the pharmaceuticals, but only 159 for the LEDs as a result of limitations imposed by the size of the available mirror array and the strength of the LED source). Second, the details of the spectra are rather different, with the LED spectra being slowly varying and the pharmaceutical spectra having many sharp peaks. To investigate what was behind the difference in performance gain, we downsampled the pharmaceutical library to 159 spectral channels and re-ran the simulation. Figure 12 compares the performance of the LED experiment and the new simulation with downsampled pharmaceutical spectra. The close agreement between the curves suggests strongly that the performance gain is dominated by the dimensionality of the spectra and not their specific details. To further explore this issue, we performed a series of simulations where the performance gain of the AFSS at a TSNR of -30 dB was determined using downsampled pharmaceutical libraries of varying lengths. Figure 13 shows the result: the performance gain achieved by the AFSS decreases approximately linearly with the decrease in the dimension of the spectral library.
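The noise-setting procedure can be sketched as follows. The fitted coefficients of the log-log calibration line are not reproduced in this text, so the slope and intercept below are purely hypothetical placeholders, as are the function names; TSNR is treated as a plain ratio rather than a decibel value.

```python
import math


def required_sigma(class_separation, tsnr):
    """Noise standard deviation needed for a target TSNR: sigma_n = sigma_l / TSNR."""
    return class_separation / tsnr


def samples_per_trigger(sigma_n, slope, intercept):
    """Invert a log-log line log10(y) = slope * log10(x) + intercept for x, given y = sigma_n."""
    log_x = (math.log10(sigma_n) - intercept) / slope
    return max(1, round(10 ** log_x))


# Hypothetical fit: noise falls as the inverse square root of the samples averaged.
slope, intercept = -0.5, 0.0

sigma_n = required_sigma(class_separation=2.0, tsnr=20.0)
print(sigma_n, samples_per_trigger(sigma_n, slope, intercept))
```

With a real calibration curve, the slope and intercept would be replaced by the values fitted to the data of Fig. 10.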

Future work and conclusion
We have discussed a novel chemical detection scheme based on adaptive feature specific spectroscopy. Simulation results with a pharmaceuticals library illustrated that AFSS systems perform dramatically better than traditional systems. Using a digital micro-mirror device as the primary adaptive element, an AFSS system was designed and used to validate our theoretical findings. The experimental results using a custom LED spectral library matched the results of our simulations very closely. The results clearly stress the benefits of such an adaptive system with regard to classification time in extremely critical application areas like defense, security and medicine.
It may be possible to find a set of feature vectors which are more discriminatory than the ones synthesized using principal component analysis (as already stated, feature vectors based on principal components are essentially ad hoc in nature). Further research is being carried out in this regard to determine the globally optimal feature vectors. Information-optimal features have already been designed and implemented in the imaging domain, where they have proven to perform better than the adaptive feature specific schemes in the low task SNR regions [25].
The variation of the performance of the AFSS system as a function of the size of the library is currently being studied. Such spectroscopic setups may also be used for operations such as concentration estimation and other parameter estimation with respect to the chemicals being investigated.

Fig. 1. The block diagram illustrates the measurement and decision framework in an AFSS system. The knowledge gained after each measurement is fully used by adaptively reconfiguring the feature vectors.

Fig. 2. The plot compares the performance of the AFSS with a static feature specific spectrometer and a traditional spectrometer.

Fig. 4. The important components of an AFSS system are shown in this block diagram.

Fig. 5. 3D model showing the optics involved in the AFSS design, the spectral filter and the mounts.

Fig. 7. A block diagram illustrating the different functional elements involved in the AFSS prototype along with all the additional hardware which aids in improving the system SNR.

Fig. 8. A snapshot showing the AFSS prototype along with the optical chopper and the lock-in amplifier.

Fig. 10. A plot showing the variation of the samples per trigger acquired by the data acquisition board with the noise standard deviation.

Fig. 11. A plot validating the performance of the AFSS. The experimental results match very closely with the simulation results, and the AFSS is shown to perform better than the traditional systems in the low task SNR regions.

Fig. 12. The performance curves for a broadband LED spectral library and a down-sampled version of the Raman spectral pharmaceuticals library.

Fig. 13. A plot showing the linear variation of the performance gain of the AFSS over its traditional counterpart with the dimensionality of the spectral library under consideration.