High-precision sensor for glucose solution using active multidimensional feature THz spectroscopy

Terahertz waves are known for their biosafety and spectral fingerprinting features, and terahertz spectroscopy holds great potential for both qualitative and quantitative identification in the biomedical field. A substantial body of research has combined this technology with machine learning algorithms for substance identification. However, because water strongly absorbs terahertz waves, the single-dimensional features of a sample become indistinct, which reduces the effectiveness of algorithmic recognition. To address this, we propose a method that combines terahertz time-domain spectroscopy (THz-TDS) with multidimensional feature spectrum identification for the detection of blood sugar and glucose mixtures. Our research indicates that combining THz-TDS with the multidimensional feature spectrum and linear discriminant analysis (LDA) can effectively identify glucose concentrations and detect adulteration. By integrating the multidimensional feature spectrum, the identification success rate increased from 68.9% to 96.0%. This method offers an economical, rapid, and safe alternative to traditional methods and can be applied in blood sugar monitoring, sweetness assessment, and food safety.


Introduction
Blood glucose is one of the most important indicators of human health, and abnormal blood sugar levels contribute to conditions such as obesity [1] and diabetes [2]. Current approaches to blood glucose detection fall into three categories: invasive, minimally invasive, and non-invasive methods [3][4][5]. Invasive methods, which commonly involve venous or arterial blood sampling, pose a risk of infection and are unsuitable for continuous glucose monitoring [6]. Minimally invasive methods, exemplified by subcutaneously implanted biosensors, measure glucose concentration in the interstitial fluid or subcutaneous tissue and allow continuous glucose monitoring [7]. Li et al. used a novel serial quadruple tapered structure-based localized surface plasmon resonance (LSPR) sensor to achieve precise detection of glucose levels in the human body, and the sensor demonstrated stable and reliable detection performance [8]. However, these methods, particularly glucose biosensors, are susceptible to rapid performance deterioration after implantation because of surface fouling and coagulation arising from inadequate biocompatibility [9]. Non-invasive techniques primarily employ optical methods such as Raman spectroscopy [10], thermal emission spectroscopy [11], and photoacoustic (PA) spectroscopy [12]. Numerous attempts have been made to measure glucose in biological tissues using these techniques, and some correlation between the measured optical signals and blood glucose has been observed. However, as noted by various authors, none have definitively proven that the measured signals correspond directly to the actual blood glucose concentration [3]. Despite these challenges, non-invasive methods offer a promising direction for blood glucose detection, potentially mitigating the risks associated with invasive procedures and continuous monitoring.
Terahertz waves possess spectral fingerprinting features, penetrability, stability, and biosafety [13][14][15][16]. On this basis, terahertz time-domain spectroscopy (THz-TDS) systems have been employed to analyze various materials, including genetically modified organisms [17], organic substances [18], and materials in the biomedical field [19,20]. SONG Chao et al. [21] demonstrated that infrared spectroscopy struggles to discriminate molecules with minimal structural differences, which limits its use for glucose solution analysis and identification. In contrast, terahertz time-domain spectroscopy has been applied to glucose, for example in the work of Fischer et al. [22], who measured time-domain spectra of α-D-glucose and β-glucose in the solid state. Because the number of hydrogen bonds varies between molecules, the strength of the hydrogen bond network and the resulting intensity of molecular motion differ, which gives rise to differences in terahertz wave absorption [23]. Furthermore, a variety of techniques based on optical, electrical, and magnetic fields have been employed to manipulate THz devices. Researchers have developed electrically controlled THz modulators by exploiting the combination of high electron mobility and a near-zero bandgap. These modulators exhibit remarkable modulation depth and a broad modulation range across various applied voltages [24]. In addition, the optical properties of two-dimensional materials have been explored using external light fields generated by 1064 nm lasers [25]. Therefore, combining external control (e.g., light control) with terahertz time-domain spectroscopy appears to be a viable approach for analyzing glucose solutions.
Consequently, we propose a multidimensional terahertz feature spectroscopy method, in which terahertz feature spectra are acquired under distinct external controls, thereby increasing the dimensionality of the dataset. Compared with unidimensional data, multidimensional data offer a more complete description of the specimens and thus improve the efficiency of substance identification. In this study, we propose a method for discriminating glucose solutions and mixtures that employs THz-TDS to analyze the transmission spectra of solutions of different concentrations on the basis of multidimensional terahertz feature spectroscopy. Various algorithms are available for terahertz data analysis, and the choice depends on the nature of the data, the research objectives, and resource availability. Linear Discriminant Analysis (LDA) [26][27][28] is commonly used for classification and dimensionality reduction, aiming to maximize the separation between categories. Other commonly used algorithms in terahertz data analysis include Principal Component Analysis (PCA) [29] for dimensionality reduction, as in the principal component analysis of terahertz spectra of hemagglutinin protein and its antibody [30]; Support Vector Machines (SVM) for the classification of traditional Chinese medicine [31] and antibiotics [32]; and Random Forest (RF), an ensemble method capable of robustly handling complex data [33,34]. After a comparative analysis of these algorithms, considering the requirements of data classification and dimensionality reduction as well as the data complexity, we selected the LDA algorithm for terahertz data processing. The findings show that the combination of the multidimensional terahertz feature spectrum and LDA constitutes an effective and precise approach for identifying solution concentrations.

Experimental details and discussion
The THz time-domain spectrometer (THz-TDS) used in the experiment is shown in Fig. 1(a). The femtosecond laser has a central wavelength of 780 nm, a pulse width of 100 fs, and a repetition frequency of 83 MHz. The laser output power is approximately 15 mW. A light-control method is introduced in this experiment. As depicted in Fig. 1(b), a control laser beam with adjustable power (wavelength 808 nm, power range 0-10 W) is directed onto the sample at a 45-degree angle, together with a normally incident terahertz wave. An overly large incident angle enlarges the photo-induced spot on the sample and leads to a loss of laser energy. Conversely, an excessively small incident angle may introduce interference between the control light path and the experimental optical path.
The glucose thin film samples on a glass substrate are fabricated as follows. First, a certain mass of anhydrous glucose particles is dissolved in 100 mL of deionized water. The solution is then sonicated and stirred to prepare glucose solutions of varying concentrations. As depicted in the diagram, solutions are prepared with concentrations of 0.1 g/100 mL (representing normal blood glucose levels in humans), 0.5 g/100 mL (complying with sugar-free standards for beverages), and 2.5 g/100 mL and 10.0 g/100 mL (typical beverage sweetness levels). Additionally, a mixed solution sample is designed, consisting of 5.0 g of glucose particles, 5.0 g of erythritol particles, and 100 mL of deionized water. Using a dropper, the solution is evenly distributed onto a glass slide, and a cover slip is gently placed over it for a 10-minute settling period, resulting in the formation of glucose thin film samples. Ten samples of each solution type are prepared, for a total of 50 samples.
During the collection of spectral data, the samples are positioned between the detecting and transmitting antennas, with terahertz waves incident perpendicular to the sample surface, to obtain transmission spectra of the samples. Each sample is measured three times, and the averages are calculated. In this way, time-domain spectra for all 50 samples are acquired. Throughout all experiments, characterization is conducted at 25 °C and 40% humidity.
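The averaging of repeated scans described above can be sketched as follows. This is a minimal illustration rather than the acquisition code used in the experiment: the names `SAMPLE_TYPES` and `average_scans` and the toy scan values are hypothetical, and the sketch assumes that the three scans of a sample share the same time axis.

```python
from statistics import fmean

# Hypothetical labels for the five solution types described in the text.
SAMPLE_TYPES = ["0.1 g/100 mL", "0.5 g/100 mL", "2.5 g/100 mL", "10.0 g/100 mL", "Mix"]
SAMPLES_PER_TYPE = 10
SCANS_PER_SAMPLE = 3


def average_scans(scans):
    """Average repeated time-domain scans point by point.

    `scans` is a list of equal-length lists of field amplitudes that are
    assumed to be sampled on the same time axis.
    """
    if not scans:
        raise ValueError("at least one scan is required")
    length = len(scans[0])
    if any(len(scan) != length for scan in scans):
        raise ValueError("all scans must have the same number of points")
    return [fmean(points) for points in zip(*scans)]


# Toy example: three short scans of one sample (made-up numbers).
scans = [
    [0.0, 1.0, 0.5],
    [0.0, 1.2, 0.4],
    [0.0, 0.8, 0.6],
]
averaged = average_scans(scans)
total_samples = len(SAMPLE_TYPES) * SAMPLES_PER_TYPE
```

With five solution types and ten samples each, `total_samples` is 50, which matches the number of time-domain spectra stated in the text.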
The amplitude and phase of the samples in the THz frequency range are obtained by applying the fast Fourier transform (FFT) to the time-domain signals. The calculation formulas used in this process follow Refs. [35][36][37][38], where n(ω) represents the real part of the refractive index; φ(ω) denotes the phase difference; c is the propagation speed of the THz wave in vacuum; ω is the frequency; κ(ω) and α(ω) denote the extinction coefficient and the absorption coefficient, respectively; and ρ(ω) in Eqs. (2) and (3) denotes the amplitude ratio. Absorbance, which describes the degree of light absorption by a material in a more intuitive manner, is determined from the frequency spectra of the sample signal E_sample(ω) and the reference signal E_reference(ω). Using THz-TDS, we examined the prepared thin film samples. The time-domain spectra of the five distinct specimens are shown in Fig. 2(a). A Fourier transform was then applied to the data, yielding frequency-domain spectra spanning 0 THz to 1.2 THz. From Fig. 2(b), it is evident that terahertz waves passing through the sample are almost completely attenuated between 0.8 THz and 0.9 THz, so this frequency range provides little meaningful information. Consequently, for the analysis of sample characteristics, we extract the effective frequency range from 0.1 THz to 0.9 THz. The absorption coefficient and transmittance within the range of 0.1 THz to 0.9 THz were calculated using Eqs. (1)-(4). Figure 2(c) shows the absorption coefficient curves of the samples, revealing pronounced absorption in the high-frequency range and therefore low transmittance. Notably, the positions and number of absorption peaks differ among the sample types. Sample Mix exhibits two absorption peaks within the frequency range, attributed to the different absorption effects of the two substances in the mixed solution. Figure 2(d) shows the transmittance curves of the samples, indicating that above 0.7 THz the transmittance becomes almost negligible. This is attributed to the central frequency of the sample's terahertz spectrum falling within the range of 0.2-0.3 THz. Beyond 0.7 THz, the sample strongly absorbs terahertz waves, resulting in a significant reduction in transmittance.
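The extraction of optical parameters from a sample and a reference trace can be sketched as below. The relations in the docstring are the commonly used textbook forms for transmission THz-TDS and are an assumption here, since Eqs. (1)-(4) are not reproduced in the text. The sample thickness `thickness`, the function names, and the synthetic pulse are likewise hypothetical. A naive discrete Fourier transform stands in for the FFT so that the sketch has no dependencies, and no phase unwrapping is performed.

```python
import cmath
import math

C = 299_792_458.0  # speed of light in vacuum (m/s)


def half_spectrum(signal):
    """Naive O(N^2) discrete Fourier transform, bins 0 .. N//2.

    Stands in for an FFT routine so that the sketch is dependency-free.
    """
    n = len(signal)
    return [
        sum(v * cmath.exp(-2j * math.pi * m * k / n) for k, v in enumerate(signal))
        for m in range(n // 2 + 1)
    ]


def optical_parameters(sample, reference, dt, thickness,
                       f_min=0.1e12, f_max=0.9e12):
    """Per-frequency parameters inside the band [f_min, f_max] in Hz.

    Assumed textbook relations (not taken from the paper):
      n     = 1 + c * phi / (omega * d)
      kappa = c / (omega * d) * ln(4 n / (rho * (n + 1)^2))
      alpha = 2 * omega * kappa / c
      A     = -log10(rho^2)
    The phase is not unwrapped, so the phase delay must stay below pi.
    """
    n_points = len(sample)
    spec_s, spec_r = half_spectrum(sample), half_spectrum(reference)
    rows = []
    for m in range(1, n_points // 2 + 1):
        freq = m / (n_points * dt)
        if not f_min <= freq <= f_max:
            continue
        ratio = spec_s[m] / spec_r[m]
        rho = abs(ratio)            # amplitude ratio
        phi = -cmath.phase(ratio)   # phase delay of the sample pulse
        omega = 2 * math.pi * freq
        n_real = 1 + C * phi / (omega * thickness)
        kappa = C / (omega * thickness) * math.log(
            4 * n_real / (rho * (n_real + 1) ** 2))
        rows.append({
            "freq_thz": freq / 1e12,
            "rho": rho,
            "n": n_real,
            "kappa": kappa,
            "alpha_per_m": 2 * omega * kappa / C,
            "absorbance": -math.log10(rho ** 2),
        })
    return rows


# Synthetic check: the "sample" is the reference pulse, halved in amplitude
# and delayed by two time steps (circular shift), so rho and n are known.
DT = 0.05e-12   # time step (s)
N_POINTS = 128
SHIFT = 2
reference = [math.exp(-((k * DT - 3.2e-12) / 0.3e-12) ** 2 / 2)
             for k in range(N_POINTS)]
sample = [0.5 * reference[(k - SHIFT) % N_POINTS] for k in range(N_POINTS)]
rows = optical_parameters(sample, reference, DT, thickness=100e-6)
```

For measured data one would normally use an FFT library and unwrap the phase before computing n(ω). The synthetic pulse here only verifies that a known amplitude ratio and delay are recovered within the 0.1-0.9 THz band.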
The experimental results indicate that the samples exhibit distinct responses in the THz band. The refractive index and absorption coefficient curves of the samples vary across the 0.1-0.9 THz frequency range, and these variations can be used for initial identification of the samples.
To analyze the THz spectra further, it is necessary to enrich the data by modifying the experimental conditions. To this end, a control laser beam was applied to the samples at powers of 0.5 W, 1.0 W, and 1.5 W. The measured optical time-domain signals are displayed in Fig. 3.
Our results indicate that the amplitude of the THz wave decreases as the applied optical power increases from 0 to 1.5 W. To provide a more intuitive representation of the effect of light control, we calculated the modulation depth, which quantifies the percentage of amplitude variation [39]. The modulation depth is calculated from A_N, the amplitude of the sample without light control, and A_L, the amplitude of the sample with light control.
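A minimal sketch of the modulation depth calculation is given below. It assumes the common definition MD = (A_N - A_L) / A_N × 100%, which is consistent with the description of a percentage amplitude variation but is not quoted from the text. It also assumes that the amplitude is taken as the peak absolute value of the time-domain trace. The function names and trace values are hypothetical.

```python
def peak_amplitude(trace):
    """Largest absolute field value in a time-domain trace."""
    return max(abs(value) for value in trace)


def modulation_depth(a_n, a_l):
    """Percentage amplitude change, assuming MD = (A_N - A_L) / A_N * 100.

    a_n: amplitude without light control.
    a_l: amplitude with light control.
    """
    if a_n == 0:
        raise ValueError("A_N must be non-zero")
    return (a_n - a_l) / a_n * 100.0


# Made-up traces for one sample without and with the control light.
without_light = [0.0, 0.4, 1.0, -0.6, 0.1]
with_light = [0.0, 0.3, 0.8, -0.5, 0.1]
depth = modulation_depth(peak_amplitude(without_light),
                         peak_amplitude(with_light))
```

In this toy case the peak amplitude falls from 1.0 to 0.8, so the modulation depth is about 20%.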
As shown in Fig. 3(f), when the incident optical power was increased from 0 W to 0.5 W, the amplitude of the blood glucose sample (0.1 g/100 mL) decreased by 4.22%. Raising the power further to 1.0 W and 1.5 W led to larger amplitude decreases of 11.69% and 23.45%, respectively. Under the same light-control conditions, the amplitude of the sugar-free sample (0.5 g/100 mL) decreased by 12.24%, 23.53%, and 37.57%, and that of the 2.5 g/100 mL glucose solution sample (the normal one) decreased by 19.30%, 33.84%, and 48.23%. For the 10.0 g/100 mL glucose solution sample, decreases of 25.41%, 33.06%, and 42.12% were observed at the corresponding powers, while the Mix sample showed decreases of 14.78%, 16.89%, and 27.19%. This phenomenon arises from the presence of different hydrogen bonds in the glucose solutions and the mixed solution. The interaction of molecular hydrogen bonds is enhanced upon optical excitation, which increases the absorption of terahertz waves by the solution samples [40][41][42][43]. As the optical intensity increases, the terahertz wave transmittance decreases, reducing the amplitude detected by the sensor. These findings show that the terahertz signals of the samples change under optical modulation and that the modulation depth differs among substances under identical optical modulation conditions. At an optical power of 0.5 W, the sample with a concentration of 10.0 g/100 mL exhibits the maximum modulation depth, whereas at an optical power of 1.5 W, the sample with a concentration of 2.5 g/100 mL shows the highest modulation depth. It can therefore be inferred that samples with different concentrations or compositions display distinctive absorption and scattering behaviors when exposed to light, leading to variations in the absorption of terahertz waves. These differences in modulation depth enrich the pool of discriminative information. Subsequently, the absorbance of the samples under light-controlled conditions was calculated, and linear discriminant analysis (LDA) was applied for identification.
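The modulation depths reported above can be tabulated to check which sample responds most strongly at each optical power. The values are taken from the text. The dictionary layout and the helper name `strongest_response` are hypothetical choices made for this sketch.

```python
POWERS_W = (0.5, 1.0, 1.5)

# Reported amplitude reductions (%) at 0.5 W, 1.0 W and 1.5 W.
REPORTED_DEPTHS = {
    "0.1 g/100 mL": (4.22, 11.69, 23.45),    # blood glucose sample
    "0.5 g/100 mL": (12.24, 23.53, 37.57),   # sugar-free sample
    "2.5 g/100 mL": (19.30, 33.84, 48.23),   # normal sample
    "10.0 g/100 mL": (25.41, 33.06, 42.12),
    "Mix": (14.78, 16.89, 27.19),
}


def strongest_response(power_index):
    """Name of the sample with the largest modulation depth at one power."""
    return max(REPORTED_DEPTHS,
               key=lambda name: REPORTED_DEPTHS[name][power_index])


strongest = {power: strongest_response(i) for i, power in enumerate(POWERS_W)}

# Each sample's three depths form a small feature vector; no two samples
# share the same vector, which is what makes the depths discriminative.
distinct_vectors = len(set(REPORTED_DEPTHS.values())) == len(REPORTED_DEPTHS)
```

The lookup reproduces the statements in the text: the 10.0 g/100 mL sample has the largest modulation depth at 0.5 W, and the 2.5 g/100 mL sample has the largest at 1.5 W.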
Using Eqs. (1)-(4), we extracted multiple optical parameters from the samples to construct a recognition model. The absorption data were analyzed statistically using the LDA algorithm. In this research, we used the sample's response to the optical field as an intrinsic attribute to enrich the sample's features without increasing the number of samples. The dataset expanded from 11 × 50 to 22 × 50, and we applied LDA to the dataset before and after introducing these attributes. Without attribute supplementation, the model established four canonical discriminant functions for sample classification, as outlined in Table 1. The cumulative percentage of the first two discriminant functions reached 92.3%, indicating that they represent most of the information in the dataset. The scatterplots in Fig. 4(a) depict each group and reveal several data points that lie far from their own group centroids but close to the centroids of other groups. This proximity increases the risk of misclassification during prediction. The model correctly classified 70.0% of the original grouped cases, and 68.9% of the cases were correctly classified under random cross-validation. Subsequently, the 22 × 50 dataset was analyzed, yielding the canonical discriminant functions presented in Table 2. The cumulative percentage of the first two discriminant functions reached 93.5%, again indicating their representativeness. Figure 4(b) shows that the 50 data points are distinctly partitioned into five non-overlapping groups. Finally, under random cross-validation, 96.0% of the samples were correctly classified. This improvement shows that augmenting the sample features makes terahertz spectrum recognition more effective than with the original dataset.
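The comparison between the 11-feature and the 22-feature datasets can be illustrated with a small, dependency-free LDA sketch. Everything in it is an assumption made for illustration: the data are synthetic random numbers rather than measured spectra, the classifier is an equal-prior LDA with a pooled covariance matrix, and leave-one-out cross-validation stands in for the random cross-validation mentioned in the text. The function names are hypothetical.

```python
import random


def solve(matrix, rhs):
    """Solve matrix * x = rhs by Gaussian elimination with partial pivoting."""
    n = len(rhs)
    aug = [row[:] + [b] for row, b in zip(matrix, rhs)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        aug[col], aug[pivot] = aug[pivot], aug[col]
        for r in range(col + 1, n):
            factor = aug[r][col] / aug[col][col]
            for c in range(col, n + 1):
                aug[r][c] -= factor * aug[col][c]
    x = [0.0] * n
    for r in reversed(range(n)):
        tail = sum(aug[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (aug[r][n] - tail) / aug[r][r]
    return x


def fit_lda(X, y, ridge=1e-6):
    """Fit equal-prior LDA; returns {label: (weights, bias)}."""
    labels = sorted(set(y))
    dim = len(X[0])
    means = {}
    for label in labels:
        rows = [x for x, t in zip(X, y) if t == label]
        means[label] = [sum(col) / len(rows) for col in zip(*rows)]
    # Pooled within-class covariance matrix.
    cov = [[0.0] * dim for _ in range(dim)]
    for x, t in zip(X, y):
        diff = [a - b for a, b in zip(x, means[t])]
        for i in range(dim):
            for j in range(dim):
                cov[i][j] += diff[i] * diff[j]
    dof = len(X) - len(labels)
    cov = [[v / dof for v in row] for row in cov]
    for i in range(dim):
        cov[i][i] += ridge  # small ridge keeps the matrix invertible
    model = {}
    for label in labels:
        weights = solve(cov, means[label])
        bias = -0.5 * sum(a * b for a, b in zip(weights, means[label]))
        model[label] = (weights, bias)
    return model


def predict(model, x):
    """Return the label with the highest linear discriminant score."""
    def score(label):
        weights, bias = model[label]
        return sum(a * b for a, b in zip(weights, x)) + bias
    return max(model, key=score)


def leave_one_out_accuracy(X, y):
    """Refit on all samples but one and test on the held-out sample."""
    hits = 0
    for i in range(len(X)):
        model = fit_lda(X[:i] + X[i + 1:], y[:i] + y[i + 1:])
        hits += predict(model, X[i]) == y[i]
    return hits / len(X)


# Synthetic stand-in data: 5 classes x 10 samples. The baseline features
# separate the classes weakly, the light-controlled features strongly.
random.seed(0)
N_CLASSES, PER_CLASS, N_FEATURES = 5, 10, 11
labels, baseline, light = [], [], []
for cls in range(N_CLASSES):
    for _ in range(PER_CLASS):
        labels.append(cls)
        baseline.append([0.3 * cls + random.gauss(0, 1) for _ in range(N_FEATURES)])
        light.append([8.0 * cls + random.gauss(0, 1) for _ in range(N_FEATURES)])
augmented = [b + l for b, l in zip(baseline, light)]  # 22 features per sample

acc_baseline = leave_one_out_accuracy(baseline, labels)
acc_augmented = leave_one_out_accuracy(augmented, labels)
n_functions = min(N_CLASSES - 1, len(augmented[0]))
```

With five classes, at most `min(5 - 1, number of features) = 4` canonical discriminant functions exist, which agrees with the four functions reported for the model. The accuracies produced by this sketch depend entirely on the synthetic data and are not comparable to the 68.9% and 96.0% reported for the measured spectra.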

Conclusion
In conclusion, this study has demonstrated that the combined use of terahertz time-domain spectroscopy (THz-TDS) and linear discriminant analysis (LDA) for the analysis of blood glucose solutions and glucose mixtures allows the preliminary determination of approximate concentration ranges and effectively addresses the adulteration of glucose solutions with other sugars. Building on this, we propose a multidimensional feature spectroscopy method that collects sample information under external light control and extracts spectral features. By constructing a multidimensional data model with the LDA algorithm, we observed a significant improvement in accuracy: the random cross-validation accuracy increased from 68.9% for the original dataset to 96.0%. These findings have important practical implications. The approach presented in this study provides a safer, faster, and more cost-effective alternative to traditional identification methods for blood glucose analysis and food safety assessment. The non-invasive nature of terahertz spectroscopy and the improved accuracy achieved through multidimensional data analysis offer promising opportunities for non-invasive blood glucose monitoring in clinical settings. Furthermore, the multidimensional feature spectroscopy method, combined with the theory of terahertz vibrations of water vapor [44], can be extended to other fields, such as pharmaceutical quality control, chemical analysis, and environmental monitoring, where rapid and accurate identification of complex substances is crucial.
Funding. National Natural Science Foundation of China (12104314, 62205136); Science, Technology and Innovation Commission of Shenzhen Municipality (20200812163234001); Key Laboratory of Optoelectronic Devices and Systems of the Ministry of Education and Guangdong Province; Liaocheng University (318052316).

Fig. 1. (a) Schematic of the THz-TDS system. The sample is placed between the emitter and the detector. (b) Light-control configuration. The control laser beam is incident on the sample at 45 degrees.

Fig. 2. (a) Terahertz time-domain spectra of the samples. (b) Frequency-domain spectra of the samples. (c), (d) Absorption coefficient spectra and transmittance spectra, respectively.

Fig. 3. (a)-(e) Optical time-domain signals of the different samples. (f) Modulation depth under light control.

Fig. 4. (a) Scatterplot matrix of the canonical discriminant functions in the initial state. (b) Scatterplot matrix of the canonical discriminant functions under light-controlled states.