Detection of single-base mutation of DNA oligonucleotides with different lengths by terahertz attenuated total reflection microfluidic cell

Abstract: Many human genetic diseases are caused by single-base mutation in the gene sequence. Since DNA molecules with single-base mutation are extremely difficult to differentiate, existing detection methods are invariably complex and time-consuming. We propose a new label-free and fast terahertz (THz) spectroscopic technique based on a home-made terahertz attenuated total reflection (ATR) microfluidic cell and a terahertz time-domain spectroscopy (THz-TDS) system to detect single-base-mutated DNA molecules. The target DNA molecules are oligonucleotides of the normal hemoglobin gene and the sickle cell anemia gene (15 nt), and of the JAK2 gene wild type and the JAK2 V617F gene mutation (39 nt), associated with sickle cell anemia and thrombocytopenia, respectively. Results show that the oligonucleotide fragments with single-base mutation can be identified by THz spectroscopy combined with the ATR microfluidic cell, and that short oligonucleotide fragments with single-base mutation are recognized better than long ones. The terahertz biosensor is shown to have high sensitivity and can be used to detect DNA molecules directly in the solution environment.


Introduction
DNA is one of the most important biological macromolecules and plays an essential role in life activities. As the carrier of genetic information, it participates in the transmission and expression of genetic information in cells, thereby promoting and controlling metabolic processes. DNA detection not only provides important information for scientific research [1,2], but can also be employed for early diagnosis, treatment and prognosis of diseases (especially tumors) [3,4]. As such, DNA detection has been widely used in all aspects of biomedicine, and is one of the indispensable detection methods in basic medical research and clinical disease treatment [5].
Many human diseases, such as thalassemia and a large variety of tumors, are caused by single-base mutations in the gene sequence. These single-base mutations can be used as biomarkers and are very useful for early medical diagnosis of diseases [6]. A single-base mutation is a point mutation that occurs at a specific position in a genome and constitutes the most common form of genetic variation. Such mutations are frequent: in the human genome, about one base in every 100-300 bases may be mutated [7]. A single-base mutation may affect gene function through amino acid substitution, modification of gene expression or alteration of gene splicing, and is closely associated with various common diseases and individual differences in drug metabolism [8]. Although numerous techniques for point mutation detection have been proposed to date, most of these approaches require target amplification, typically with polymerase chain reaction (PCR) [9][10][11]. Additional efforts are thus needed to explore more broadly applicable methods for sensitive, accurate, rapid, and low-cost single-base mutation identification [12][13][14].
At present, there are a number of detection methods of single-base mutation, and new methods are emerging. Most of these methods can be regarded as comprising two separate steps: distinguishing the specific site, and detecting the change and analyzing the data. The distinguishing point of single-base mutation is mainly achieved by hybridization, PCR, molecular conformation and enzyme method. Quantitative detection methods mainly include gel electrophoresis, fluorescence, DNA chip and mass spectrometry. Thus, most single-base mutation detection is a combination of the above two categories of methods. For example, the task of distinguishing a specific site based on DNA hybridization can be accomplished by a combination of fluorescence method [15] and DNA chip technology [16].
Sickle cell anemia (SCD) is an autosomal recessive genetic disease that belongs to the β-hemoglobin diseases. β-hemoglobinopathy is caused by mutation of the HBB gene and is the most common single-base-mutation genetic disease affecting humans. SCD is caused by the mutation of the 6th codon of the HBB gene from GAG to GTG. This replaces the 6th residue of the HBB protein, glutamate (Glu), with valine (Val), so that normal hemoglobin is replaced by sickle hemoglobin S (HbS) [17]. Under reduced oxygen, HbS is prone to polymerization, which leads to sickling, hemolysis and aggregation of red blood cells and adhesion of white blood cells in the microvascular system, and eventually to vascular occlusion. This process leads to various serious sequelae. The main clinical manifestations of SCD are chronic hemolytic anemia, predisposition to infection, recurrent pain crises, and tissue and organ damage caused by chronic ischemia.
Primary thrombocytopenia (PT) is a kind of myeloproliferative disease (MPD) characterized by being Ph chromosome-negative. The JAK2 V617F gene plays a key role in the growth of human hematopoietic factors, and the mutation is very important for the study of the pathogenesis of MPD [18]. The mutation occurs at the 1849th position of the gene, where the original guanine is replaced by thymine, which leads to the missense substitution of valine by phenylalanine [19]. A series of mechanisms then leads to over-sensitivity of hematopoietic cells to growth factors, resulting in abnormal cell proliferation.
Early, accurate, simple and rapid identification of these single-base mutations is of particular importance for understanding the pathogenesis and for early therapy of the corresponding diseases. To that end, a new detection method for single-base mutation has been developed in this work. Terahertz spectroscopy has matured into a versatile tool for scientific research in recent years. The terahertz band lies between the microwave and infrared regions of the electromagnetic spectrum, with an approximate frequency range of 0.1 to 10 THz. Terahertz radiation causes no known damage to biological systems at moderate power levels [20][21][22], and it can provide conformational information on biological molecules that is closely related to their biological functions in cells and is difficult to obtain by other optical, X-ray and nuclear magnetic resonance spectroscopy techniques [23]. However, applications of terahertz spectroscopy in biomedicine have long faced a bottleneck: water absorbs terahertz waves strongly, which overwhelms most of the terahertz spectral characteristics of biomolecules in solution, whereas biomolecules can only reflect their conformational information in solution. One effective way to overcome this bottleneck is to confine the active biomolecules in a micro-nano structure, which not only reduces the strong terahertz absorption of water in the solution [24,25], but also more closely simulates the conformation of biomolecules confined in the cell, so as to enhance the THz signal and improve the detection accuracy and sensitivity [26][27][28]. In this study, an attenuated total reflection (ATR) microfluidic cell was used as the loading device for liquid samples. Based on the microfluidic structure, the ATR mode is used in place of the transmission mode, which removes the influence of the sample thickness on the subsequent calculation and eliminates standing-wave interference.
In this work, the oligonucleotides associated with sickle cell anemia and thrombocytopenia, each caused by a single-base variation, are selected as the research targets. THz-TDS combined with ATR microfluidic detection is applied to this biomedical diagnosis research, which confirms the potential value of THz-TDS in clinical disease detection.
The oligonucleotides (Table 1) were synthesized and then purified by HPLC (high-performance liquid chromatography) at Sangon Biotech Co., Ltd. (Shanghai, China). They were dissolved separately in a TE buffer at pH 7.2 at concentrations of 5 µg/µL and 0.5 µg/µL.
The conventional liquid cell has two quartz windows and a depth of 100 µm. The inlet and outlet holes are at the edge of the circle. The inner dimension of the cell is 13 mm. The outer ring is used to store excess liquid, as shown in Fig. 1(a). Before each THz spectroscopy measurement, the liquid sample was gently injected into the circular reservoir through the inlet hole by pipette. The reservoir was completely filled with the liquid sample, and the excess liquid flowed into the outer ring. The volume of the reservoir is around 14.0 µL.
Attenuated total reflection (ATR) gives access to the interaction between the evanescent THz wave and the measured object on the surface of the prism where total internal reflection occurs. The evanescent wave propagates along the tangent direction of the interface, and its amplitude decreases exponentially with the distance from the interface. In the terahertz band, the penetration depth of the evanescent wave (the distance at which the amplitude falls to 1/e of its value at the interface) is generally tens of microns. Therefore, when the sample to be measured is placed on the surface of the ATR prism, the evanescent wave penetrates through the prism into the bottom layer of the sample, and the emitted terahertz signal mainly reflects the physicochemical properties of this bottom layer in the THz band (as shown in Fig. 1(b)).

Experimental preparation
With the same detecting instruments, the penetration depth of the evanescent wave is only related to the refractive indices of the prism and the liquid sample and the THz wavelength in the sample, but not the thickness of the sample. This effectively circumvents the problem that the transmission detection method encounters as the latter needs to accurately control and measure the sample thickness. Furthermore, the THz-ATR scheme presented here also improves the detection accuracy and simplifies the pre-processing process of the sample. At the same time, such reflection-based detection setup will not suffer from interference of the standing wave resonance, which simplifies the post-processing of data.
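The dependence of the penetration depth on the refractive indices and the wavelength can be made concrete with a short calculation. The sketch below uses the standard lossless expression d_p = λ0 / (2π·sqrt(n1² sin²θ − n2²)) for the 1/e amplitude depth. The refractive indices and the incidence angle are illustrative assumptions rather than values reported here, and the absorption of water is neglected.

```python
import math

C = 299_792_458.0  # speed of light in vacuum (m/s)


def penetration_depth(freq_hz, n_prism, n_sample, theta_deg):
    """Return the 1/e amplitude penetration depth (m) of the evanescent wave.

    Lossless approximation:
    d_p = lambda_0 / (2*pi*sqrt(n_prism^2 * sin^2(theta) - n_sample^2)).
    """
    theta = math.radians(theta_deg)
    s = (n_prism * math.sin(theta)) ** 2 - n_sample ** 2
    if s <= 0:
        raise ValueError("angle is below the critical angle; no evanescent wave")
    wavelength_vacuum = C / freq_hz
    return wavelength_vacuum / (2.0 * math.pi * math.sqrt(s))


# Illustrative assumptions, not values reported in the paper.
N_SILICON = 3.42   # assumed refractive index of the silicon prism
N_SAMPLE = 2.1     # assumed real refractive index of an aqueous sample near 1 THz
THETA_DEG = 51.6   # assumed internal incidence angle on the prism surface

for f_thz in (0.3, 1.0, 1.4):
    d_p = penetration_depth(f_thz * 1e12, N_SILICON, N_SAMPLE, THETA_DEG)
    print(f"{f_thz:.1f} THz: penetration depth ~ {d_p * 1e6:.1f} um")
```

Under these assumptions the depth at 1 THz comes out at a few tens of microns and grows toward lower frequencies. The sample thickness does not enter the expression at all, which is the point made in the paragraph above.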
The prism material is selected for its high transmission and its refraction characteristics in the terahertz band. At present, the power of terahertz sources is weak and their light spot is relatively large. In order to couple as much radiation energy into and out of the prism as possible for a given THz band, an appropriate prism thickness should be selected. Then, according to the dielectric properties of the samples on the prism surface, the critical angle for total internal reflection is calculated to determine the bevel angle of the prism. The sample cell is divided into two parts: the bearing cell on the prism surface, which is integrated with the prism, and the cover, as shown in Fig. 1(b). The cover of the sample cell is made of PDMS. A hexagonal groove with a depth of 200 µm is engraved on the inner side of the PDMS. The sample inlet and outlet are set at two opposite corners of the hexagonal groove, as shown in Fig. 1(c). The liquid sample was gently injected into the hexagonal groove through the inlet hole by a syringe connected to a plastic pipe. The groove was completely filled with the liquid sample, and the excess liquid flowed into the outlet pipe.
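The critical-angle step in the prism design can be sketched as follows. The code applies Snell's law, θ_c = arcsin(n_sample / n_prism), to a few candidate media in the channel. All refractive indices are assumed for illustration only, and the actual bevel angle of the prism is not derived here.

```python
import math


def critical_angle_deg(n_prism, n_sample):
    """Critical angle (degrees) for total internal reflection at a prism/sample interface."""
    if n_sample >= n_prism:
        raise ValueError("total internal reflection requires n_prism > n_sample")
    return math.degrees(math.asin(n_sample / n_prism))


N_SILICON = 3.42  # assumed refractive index of the silicon prism

# Assumed real refractive indices of media that may sit on the prism surface.
candidates = {
    "empty channel (air)": 1.0,
    "aqueous sample near 1 THz": 2.1,
    "aqueous sample at the low-frequency end": 2.5,
}

for name, n in candidates.items():
    print(f"{name}: critical angle ~ {critical_angle_deg(N_SILICON, n):.1f} deg")

# The internal incidence angle must exceed the largest critical angle in the band.
design_floor = max(critical_angle_deg(N_SILICON, n) for n in candidates.values())
print(f"incidence angle must exceed ~ {design_floor:.1f} deg")
```

Because the refractive index of water rises toward lower THz frequencies, the low-frequency end of the band sets the tightest constraint on the incidence angle in this sketch.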

THz spectroscopy measurement
A commercial THz-TDS system (Tera K15, Menlo Systems GmbH, Munich, Germany) with the transmission mode was utilized to measure the THz spectra of the DNA oligonucleotide solutions in the liquid cell and THz ATR microfluidic cell. Briefly, a femtosecond laser with a center wavelength of 1560 nm, a repetition rate of 100 MHz and a pulse width below 90 fs was split into the pump beam and the probe beam. The pump beam was irradiated on a biased photoconductive (PC) antenna to emit THz pulse that was pre-collimated by a high-resistivity silicon lens attached behind the PC antenna. Then the pulse was collimated by a polymethylpentene (TPX) lens with an effective focal length of about 50 mm, and focused by another TPX lens. THz pulse transmitted through the sample was detected via a process nearly reverse to the process of THz emission. An optical delay line in the probe optical path was utilized to sample the THz pulse with a time interval of 33.3 fs and the sampled data were used to rebuild the THz time-domain pulse signal. A relative humidity of 3% was maintained by the purge of nitrogen gas during measurements.
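The 33.3 fs sampling interval fixes the highest frequency that the sampled trace can represent, and the length of the time window fixes the frequency resolution. The sketch below works this out with NumPy and applies a Fourier transform to a synthetic pulse. The pulse shape, the 3000-point window and the double-pass delay-line geometry are assumptions for illustration, not parameters of the instrument.

```python
import numpy as np

C = 299_792_458.0  # speed of light in vacuum (m/s)
DT = 33.3e-15      # sampling interval quoted in the text (s)

# Nyquist frequency implied by the sampling interval.
nyquist_thz = 1.0 / (2.0 * DT) / 1e12

# Mirror displacement per step, ASSUMING a double-pass (retroreflector) delay line.
mirror_step_um = C * DT / 2.0 * 1e6


def amplitude_spectrum(trace, dt):
    """Return (frequencies in Hz, spectral amplitude) of a real time-domain trace."""
    freqs = np.fft.rfftfreq(len(trace), d=dt)
    return freqs, np.abs(np.fft.rfft(trace))


# Hypothetical single-cycle pulse (derivative of a Gaussian) on an assumed ~100 ps window.
n_points = 3000
t = np.arange(n_points) * DT
t0, tau = 10e-12, 0.3e-12
pulse = -(t - t0) / tau * np.exp(-((t - t0) / tau) ** 2)

freqs, amp = amplitude_spectrum(pulse, DT)
resolution_ghz = (freqs[1] - freqs[0]) / 1e9
peak_thz = freqs[np.argmax(amp)] / 1e12

print(f"Nyquist frequency: {nyquist_thz:.2f} THz")
print(f"Delay-line step:   {mirror_step_um:.2f} um")
print(f"Resolution:        {resolution_ghz:.2f} GHz")
print(f"Spectral peak:     {peak_thz:.2f} THz")
```

The Nyquist limit of about 15 THz lies far above the 0.3-1.4 THz band analysed later, so under these assumptions the sampling interval does not limit the usable bandwidth.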

Calculation of absorption coefficient
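According to the Beer-Lambert law, the absorption coefficient of the oligonucleotide solution sample after subtracting the contribution from the buffer solution was calculated as
$$\alpha = \frac{1}{d}\ln\frac{I_{buff}}{I_{s}} \tag{1}$$
where $I_{buff}$ and $I_{s}$ are the power transmissions of the buffer solution and the oligonucleotide solution, respectively, and $d$ is the thickness of the liquid cell ($d$ = 0.1 mm).
The calculation method of the THz absorption coefficient of DNA oligonucleotide solutions in the ATR microfluidics has been established. If the electric field amplitude of the incident terahertz wave is $\tilde{E}_{in}$, then according to the law of reflection the reflected fields $\tilde{E}_{ref}$ and $\tilde{E}_{sam}$ satisfy
$$\frac{\tilde{E}_{sam}}{\tilde{E}_{ref}} = \frac{\tilde{r}_{sam}\tilde{E}_{in}}{\tilde{r}_{ref}\tilde{E}_{in}} = \frac{\tilde{r}_{sam}}{\tilde{r}_{ref}} \tag{2}$$
where $\tilde{r}_{ref}$ is the reflection coefficient when the liquid sample cell is empty and $\tilde{r}_{sam}$ is the reflection coefficient after the sample is injected. According to Fresnel's law of reflection, the reflection coefficients $\tilde{r}_{12}$ and $\tilde{r}_{23}$ of the silicon-microfluidics interface and the microfluidics-PDMS interface under P polarization have been established. The total reflection coefficient $\tilde{r}$ of the terahertz wave incident on the ATR microfluidics is
$$\tilde{r} = \frac{\tilde{r}_{12} + \tilde{r}_{23}\,e^{2i\tilde{\beta}}}{1 + \tilde{r}_{12}\tilde{r}_{23}\,e^{2i\tilde{\beta}}}, \qquad \tilde{\beta} = \frac{2\pi\nu d}{c}\sqrt{\tilde{\varepsilon} - \varepsilon_{Si}\sin^{2}\theta} \tag{3}$$
where $d$ is the microfluidics depth, $\varepsilon_{Si}$ is the dielectric constant of the silicon prism, $\tilde{\varepsilon}$ is the complex dielectric constant of the sample on the total reflection surface, $\theta$ is the angle of incidence on that surface, $\nu$ is the frequency, and $c$ is the speed of light. When the sample is not injected into the microfluidics, $\tilde{r} = \tilde{r}_{ref}$; when the sample is filled into the microfluidics, $\tilde{r} = \tilde{r}_{sam}$. The complex permittivity of the liquid sample can be obtained by combining Eqs. (2) and (3). Furthermore, the absorption coefficient of the sample is obtained as
$$\alpha(\nu) = \frac{4\pi\nu}{c}\,\mathrm{Im}\sqrt{\tilde{\varepsilon}} = \frac{4\pi\nu}{c}\sqrt{\frac{\sqrt{\varepsilon'^{2} + \varepsilon''^{2}} - \varepsilon'}{2}} \tag{4}$$
where $\varepsilon'$ and $\varepsilon''$ are the real and imaginary parts of the complex permittivity.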
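The calculation chain described above can be sketched numerically. The code below is a minimal illustration, not the authors' processing code. It assumes the Beer-Lambert form α = ln(I_buff / I_s) / d for the liquid cell, the standard three-layer (prism / channel / cover) Fresnel expression for P polarization for the ATR cell, and α = 4πν·Im(√ε) / c for the conversion from permittivity to absorption. The refractive indices of silicon and PDMS, the incidence angle and the sample permittivity are illustrative assumptions.

```python
import numpy as np

C = 299_792_458.0  # speed of light in vacuum (m/s)


def alpha_beer_lambert(i_buff, i_s, d):
    """Absorption coefficient relative to the buffer; units follow 1/d."""
    return np.log(np.asarray(i_buff, dtype=float) / np.asarray(i_s, dtype=float)) / d


def alpha_from_permittivity(eps, freq_hz):
    """Power absorption coefficient (1/m) from a complex permittivity eps' + i*eps''."""
    kappa = np.sqrt(np.asarray(eps, dtype=complex)).imag  # extinction coefficient
    return 4.0 * np.pi * freq_hz * kappa / C


def fresnel_p(eps_i, eps_j, kz_i, kz_j):
    """P-polarized Fresnel reflection coefficient from medium i into medium j."""
    return (eps_j * kz_i - eps_i * kz_j) / (eps_j * kz_i + eps_i * kz_j)


def atr_reflection(eps_layer, freq_hz, d, theta_deg,
                   eps_prism=3.42 ** 2, eps_cover=1.4 ** 2):
    """Total reflection coefficient of a prism / channel / cover stack.

    eps_prism and eps_cover are ASSUMED lossless values for silicon and PDMS.
    """
    k0 = 2.0 * np.pi * freq_hz / C
    kx2 = eps_prism * np.sin(np.radians(theta_deg)) ** 2  # conserved (k_x / k0)^2
    # Normal wavevector components; the principal square root gives decaying waves.
    kz1, kz2, kz3 = (k0 * np.sqrt(np.complex128(e) - kx2)
                     for e in (eps_prism, eps_layer, eps_cover))
    r12 = fresnel_p(eps_prism, eps_layer, kz1, kz2)
    r23 = fresnel_p(eps_layer, eps_cover, kz2, kz3)
    phase = np.exp(2j * kz2 * d)
    return (r12 + r23 * phase) / (1.0 + r12 * r23 * phase)


# Illustrative inputs (assumptions): 1 THz, 200 um channel depth, 51.6 deg incidence.
FREQ, DEPTH, THETA = 1.0e12, 200e-6, 51.6
eps_sample = (2.1 + 0.5j) ** 2          # hypothetical water-like permittivity

r_ref = atr_reflection(1.0, FREQ, DEPTH, THETA)         # empty channel
r_sam = atr_reflection(eps_sample, FREQ, DEPTH, THETA)  # filled channel
print(f"|r_sam / r_ref| = {abs(r_sam / r_ref):.3f}")
print(f"alpha = {alpha_from_permittivity(eps_sample, FREQ) / 100:.0f} cm^-1")
```

This is only the forward model. Recovering the sample permittivity from a measured ratio r_sam / r_ref requires inverting it numerically at each frequency, for example with a root finder, and that step is not shown here.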

THz spectroscopy analysis of oligonucleotides based on conventional liquid cell
The THz spectra of four oligonucleotides (normal hemoglobin gene, sickle cell anemia gene, JAK2 gene wild type, and JAK2 V617F gene mutation) in TE buffer solution loaded into the liquid cell (Fig. 1(a)) were measured using the THz-TDS system. Each oligonucleotide solution sample was measured three times on different days to minimize the influence of fluctuations in instrument performance. Before each measurement, the commercial sample cell was cleaned in anhydrous ethanol and deionized water and then dried in nitrogen, three times. The absorption coefficient of each sample in the frequency range of 0.3-1.4 THz is shown in Fig. 2. None of the samples shows a characteristic absorption peak, but there are significant differences in the THz absorption coefficients between the normal hemoglobin gene and sickle cell anemia gene solutions at a concentration of 5 µg/µL. In this frequency band, the difference in absorption coefficients between the normal hemoglobin gene and the sickle cell anemia gene is clearly greater than that between the wild type of the JAK2 gene and the JAK2 V617F mutant. There is some overlap of the error bars in Fig. 2(b), but the oligonucleotides of the healthy and mutant genes from sickle cell anemia can nevertheless be differentiated by the absorption intensity averaged over all frequencies. In addition, to demonstrate the system's capability for trace detection, the samples were diluted 10 times to a concentration of 0.5 µg/µL and measured again. The resulting absorption coefficients of the four oligonucleotide solutions in the frequency range of 0.3-1.4 THz are shown in Fig. 3. The differences in THz absorption between the four oligonucleotide solutions decreased and were no longer significant.
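The comparison described above, with repeated measurements, error bars and a band-averaged difference, can be illustrated as follows. The spectra are synthetic placeholders and not measured data, and using the sample standard deviation of the three repeats as the error bar is an assumption about how the error bars were formed.

```python
import numpy as np


def mean_and_std(repeats):
    """Mean and sample standard deviation across repeated spectra (rows = repeats)."""
    repeats = np.asarray(repeats, dtype=float)
    return repeats.mean(axis=0), repeats.std(axis=0, ddof=1)


def band_averaged_difference(repeats_a, repeats_b):
    """Difference of the mean spectra (a minus b), averaged over all frequency points."""
    mean_a, _ = mean_and_std(repeats_a)
    mean_b, _ = mean_and_std(repeats_b)
    return float(np.mean(mean_a - mean_b))


def error_bars_overlap(repeats_a, repeats_b):
    """Boolean mask: True where the +/- 1 std error bars of the two samples overlap."""
    mean_a, std_a = mean_and_std(repeats_a)
    mean_b, std_b = mean_and_std(repeats_b)
    return np.abs(mean_a - mean_b) <= std_a + std_b


# Synthetic placeholder spectra over 0.3-1.4 THz, three repeats per sample.
freqs_thz = np.linspace(0.3, 1.4, 12)
baseline = 20.0 * freqs_thz                        # hypothetical absorption trend
day_offsets = np.array([-1.0, 0.0, 1.0])[:, None]  # hypothetical day-to-day drift
healthy = baseline + day_offsets
mutant = baseline + 5.0 + day_offsets

print("band-averaged difference:", band_averaged_difference(healthy, mutant))
print("any overlap of error bars:", bool(error_bars_overlap(healthy, mutant).any()))
```

When the error bars overlap at individual frequencies, the band-averaged difference can still separate two samples, because averaging over many frequency points suppresses uncorrelated noise.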
It can be seen from the figure that the difference in the absorption coefficient of oligonucleotides between the normal hemoglobin gene and the sickle cell anemia gene is close to that between the wild type and the mutant type of JAK2 V617F gene at 0.5 µg/µL. However, it is worth noting that the difference in absorption coefficient between the four oligonucleotide solutions is not prominent due to the strong water absorption. It can be concluded that the absorption coefficients at low concentration (0.5 µg/µL) and high concentration (5 µg/µL) solution have the same change trend for the four oligonucleotide chains from Figs. 2 and 3. Small difference in the absorption coefficients of the same oligonucleotide chain at different concentrations and the different oligonucleotides at the same concentration can be observed from the insets in Figs. 2 and 3. The absorption coefficient itself is believed to be a manifestation of the interplay and competition between hydration layer absorption and free-water absorption [29][30][31]. It is speculated that the different absorption coefficient is indicative of different hydration layer around the oligonucleotide. The differences in the THz absorption coefficient between different concentrations and between different DNA specificities may be related to the number of hydrogen bonds formed around the oligonucleotides. Long chain oligonucleotides are more likely to form secondary structures in the solution of low concentration, and a thicker hydration layer is more likely to gather around the secondary-structure conformation, which may give rise to stronger terahertz absorption [32][33][34][35]. This is consistent with our previous experimental results [36].

THz spectroscopic analysis of oligonucleotide based on the THz ATR microfluidic cell
With the conventional THz liquid cell, the recognition of the two groups of healthy and mutated genes is not ideal at the lower concentration (0.5 µg/µL). Because the THz liquid cell is used as the carrier of the DNA molecular solution, the transmission signal is heavily affected by the sample thickness. Hence it is proposed to use the THz attenuated total reflection microfluidic cell as the carrier of the DNA molecular solution in the experiment. Using the same detection instrument, the evanescent wave penetration depth is only related to the sample refractive index and the THz wavelength, but not to the sample thickness. This avoids the need of the transmission detection method to control and measure the sample thickness accurately, and simplifies the sample pre-treatment process. At the same time, such a reflection detection scheme does not introduce the interference effect of standing wave resonance, which simplifies the data post-processing as well. THz spectra of the four oligonucleotide samples (normal hemoglobin gene, sickle cell anemia gene, JAK2 gene wild type, JAK2 V617F gene mutation) in TE buffer solution loaded into the THz ATR microfluidic cell were measured using the THz-TDS system. Each oligonucleotide solution sample was measured three times on different days. In order to ensure the stability of the system, the bearing cell on the prism stays mounted in the light path, and the cover is removed when the sample is replaced. Before each measurement, the bearing cell and the cover of the self-developed ATR microfluidic cell can be cleaned separately. The bearing cell was first cleaned alternately with anhydrous ethanol and deionized water and then dried in nitrogen, three times. The cover was first cleaned ultrasonically in baths of anhydrous ethanol and deionized water for 3 min each, and then dried in nitrogen. After cleaning, the silicon surface of the bearing cell can adhere to the PDMS.
Four oligonucleotides at a concentration of 0.5 µg/µL were used for detection. The absorption coefficients of the four oligonucleotide solutions in the frequency range of 0.3-1.4 THz are shown in Fig. 4. There were some differences in THz absorption between the normal hemoglobin gene and sickle cell anemia gene solutions. It can be seen from the figure that, at an oligonucleotide concentration of 0.5 µg/µL, the difference in the absorption coefficient between the normal hemoglobin gene and the sickle cell anemia gene is greater than that between the wild type of the JAK2 gene and the JAK2 V617F mutant. In other words, the difference between the healthy and mutant oligonucleotides is more prominent for the 15 nt pair than for the 39 nt pair, with less overlap of the error bars.
To ascertain that the difference is statistically meaningful and can be used to discriminate the four samples, a principal component analysis (PCA) was performed on more than 30 measurements of each oligonucleotide solution to classify the four groups of oligonucleotides. PCA is an unsupervised classification method that is often applied to spectral data, where the variables are numerous and often correlated. PCA reduces the dimension of the data as far as possible without losing important information from the original data [37][38][39]. Figure 5 shows the three-dimensional view of the THz spectral principal component analysis of these four single-base-mutation DNA oligonucleotides. The cumulative percentage variance of the first three principal components is 99.23%: PC1 explains 95.85% of the variance, PC2 2.77%, and PC3 0.61%. The three-dimensional space spanned by the scores of the first three PCs thus contains 99.23% of the information in the original THz data, covering an overwhelming majority of the useful information. Based on the principal component analysis, quadratic discriminant analysis (QDA) was employed to classify these four oligonucleotides. QDA is a well-known discriminant analysis approach that has been successfully used for classification in various fields [40,41]. The quality of the classification by QDA models depends on the number of principal components: more principal components result in a higher recognition rate. In the present case, the optimal QDA model was generated when 9 PCs were employed. Figure 6 shows the best identification results of QDA models with 9 principal components. The classification rates by cross-validation were 100% in the calibration set and 90.3% in the prediction set.
All 80 samples were correctly classified in the calibration set; 47 of 52 samples were correctly classified, and 5 samples from the long chain oligonucleotides were wrongly classified in the prediction set, which affirms the understanding that the longer the chain, the higher the conformational similarity of oligonucleotides with single-base mutation in solution. From the cluster results of the first three principal components and identification results of QDA, we can see that the recognition efficacy of two groups of healthy and mutated genes using the ATR microfluidic cell is better than that of oligonucleotides with the same concentration in the conventional liquid cell. The results also show that the recognition rate of short single-base mutation oligonucleotides is higher than that of the long ones. It is worth noting that the detection results using the THz ATR microfluidic cell and the conventional liquid cell are otherwise quite consistent. It can also be observed from Fig. 4 that the absorption coefficients of long-chain oligonucleotides (39 nt) were higher than those of short-chain oligonucleotides (15 nt) at the same concentration and frequency, which is again consistent with the detection results using the conventional liquid cell. Furthermore, the recognition efficacy of the ATR microfluidic cell was markedly better than that of the conventional liquid cell for low-concentration trace substances, because the former was not affected by the uncertainty of sample thickness determination and standing wave interference.
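The PCA and QDA workflow described above can be sketched with NumPy alone. This is an illustration on synthetic spectra, not the authors' analysis. The 80/52 split mirrors the calibration and prediction set sizes quoted above, while the spectra, the number of components and the small ridge term added to each class covariance are assumptions.

```python
import numpy as np


def pca_fit(X, n_components):
    """Fit PCA by SVD; return the mean, the components and the explained-variance ratios."""
    mean = X.mean(axis=0)
    _, s, vt = np.linalg.svd(X - mean, full_matrices=False)
    explained = s ** 2 / np.sum(s ** 2)
    return mean, vt[:n_components], explained[:n_components]


def pca_transform(X, mean, components):
    """Project spectra onto the principal components (scores)."""
    return (X - mean) @ components.T


def qda_fit(Z, y, reg=1e-6):
    """Per-class Gaussian parameters; reg is an assumed ridge term for stability."""
    params = {}
    for label in np.unique(y):
        Zc = Z[y == label]
        cov = np.cov(Zc, rowvar=False) + reg * np.eye(Z.shape[1])
        params[label] = (Zc.mean(axis=0), np.linalg.inv(cov),
                         np.linalg.slogdet(cov)[1], np.log(len(Zc) / len(Z)))
    return params


def qda_predict(Z, params):
    """Assign each row of Z to the class with the largest quadratic discriminant score."""
    labels = list(params)
    scores = []
    for label in labels:
        mu, inv_cov, logdet, log_prior = params[label]
        diff = Z - mu
        maha = np.einsum("ij,jk,ik->i", diff, inv_cov, diff)
        scores.append(-0.5 * (maha + logdet) + log_prior)
    return np.array(labels)[np.argmax(np.array(scores), axis=0)]


# Synthetic placeholder spectra: four classes over 0.3-1.4 THz (hypothetical values).
rng = np.random.default_rng(0)
freqs_thz = np.linspace(0.3, 1.4, 56)


def synth(n_per_class):
    X, y = [], []
    for label in range(4):
        base = (100.0 + 10.0 * label) * freqs_thz
        X.append(base + rng.normal(0.0, 1.0, size=(n_per_class, freqs_thz.size)))
        y += [label] * n_per_class
    return np.vstack(X), np.array(y)


X_cal, y_cal = synth(20)    # 80 calibration spectra
X_pred, y_pred = synth(13)  # 52 prediction spectra

mean, components, explained = pca_fit(X_cal, n_components=3)
model = qda_fit(pca_transform(X_cal, mean, components), y_cal)
predicted = qda_predict(pca_transform(X_pred, mean, components), model)
accuracy = float(np.mean(predicted == y_pred))

print("explained variance ratios:", np.round(explained, 4))
print("prediction-set accuracy:", accuracy)
```

The PCA is fitted on the calibration spectra only and then applied unchanged to the prediction spectra, so that the prediction set stays independent of the model. The synthetic classes here are far better separated than real oligonucleotide spectra, so the accuracy printed by this sketch says nothing about the experimental recognition rates.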

Conclusions
Detection of single-base mutation in DNA molecules is an important part of the analysis of diseases of genetic origin. However, differentiation of DNA molecules with single-base mutation is notoriously difficult, because the corresponding discrepancies in physical, chemical and other detectable properties are usually minuscule, and the few available detection methods are convoluted and time-consuming. As THz detection of biomolecular solutions with conventional liquid cells is strongly affected by the sample thickness, we proposed and demonstrated a self-developed terahertz attenuated total reflection microfluidic cell to detect single-base-mutation DNA molecules (normal hemoglobin and sickle cell anemia genes with 15 nt, JAK2 wild type and JAK2 V617F mutants with 39 nt) using THz time-domain spectroscopy. The proposed THz biosensor technology has high sensitivity and needs only a small amount of sample to differentiate DNA molecular solutions with different mutations and lengths. The THz absorption spectra of DNA oligonucleotides with single-base mutation were obtained, and their differences were observed and analyzed by principal component analysis and quadratic discriminant analysis. The recognition rate of short oligonucleotides with single-base mutation is higher than that of long ones. Our results indicate that DNA oligonucleotides with single-base mutation can be clearly identified by THz spectroscopy employing our attenuated total reflection microfluidic cell.

Disclosures
The authors declare no conflicts of interest.