A Comprehensive Study on the Acidic Compounds in Gas and Particle Phases of Mainstream Cigarette Smoke

Abstract: Acidic compounds constitute a group of chemicals present in mainstream cigarette smoke, among which organic acids contribute to flavoring. To obtain a comprehensive understanding of the acidic constituents in both the particulate and gaseous phases of the mainstream smoke of commercial cigarettes, and to delineate the difference between two types of cigarettes, the acidic constituents from nine cigarettes of two commercial brands (L- and M-types) were collected and analyzed in detail by gas chromatography–mass spectrometry (GC-MS). In total, 46 compounds were identified and quantified, and were grouped according to their substituent groups. Compositional differences between the two cigarette types were evaluated with statistical approaches. Comparisons of individual, grouped, and total acid contents, between the particulate and the gaseous phases and between the commercial L- and M-type tobaccos, were conducted and characterized by the p values obtained from Student's t-test. Multivariate analysis was performed using principal component analysis (PCA) and orthogonal projections to latent structures discriminant analysis (OPLS-DA) models to identify the acids that enable a reliable differentiation of the two types. Seventeen acidic compounds with p < 0.05 and variable importance in projection (VIP) > 1 were identified as key components that could discriminate between the two groups of commercial cigarettes. This study may be beneficial for the development of non-combusted tobacco products, which could serve as alternatives to traditional cigarettes.


Introduction
In recent decades, a substantial body of evidence has emerged linking cigarette smoking to a range of significant health risks, including lung cancer, other forms of cancer, chronic obstructive pulmonary disease, stroke, liver disease, and coronary heart disease [1][2][3][4][5]. As a preventable cause of death worldwide, cigarette smoking is a matter of considerable public health concern. In addition to the harmful substances found in mainstream cigarette smoke, environmental tobacco smoke, which is composed of exhaled mainstream smoke and sidestream smoke, also contains known carcinogens and toxic compounds [6]. Nonsmokers who are exposed to environmental tobacco smoke over long periods are also at risk of developing lung cancer. In an effort to address these health concerns, alternative smokeless and non-combusted tobacco products, such as e-cigarettes and heat-not-burn devices, have been developed as potential substitutes for traditional cigarettes [7,8]. These products are

Sample Preparation
The cigarettes were equilibrated at 22 ± 1 °C and a relative humidity of 60 ± 3% for 48 h before smoking. Mainstream cigarette smoke was generated under an ISO machine smoking regimen [21] using a linear smoking machine (SM 450, Cerulean, Milton Keynes, UK). The particulate phase of the mainstream cigarette smoke was collected using 44 mm Cambridge filter pads, and the gaseous phase was collected using a CX-572 cartridge with 300 mg of carbon molecular sieves. After smoking, the Cambridge filter pad was cut into two pieces and extracted with 6 mL CH2Cl2 containing 50 µL of internal standard solution (9.35 mg/mL benzene-d6 and 9.06 mg/mL phenethyl phenylacetate) under sonication for 30 min. The liquid extract was then filtered and analyzed. The adsorbent in the CX-572 cartridge was transferred into a 15-mL vial. After adding 4 mL CH2Cl2 (at a rate controlled at 1 mL/min) and 50 µL of mixed internal standard solution (9.35 mg/mL benzene-d6 and 9.06 mg/mL phenethyl acetate), the vial was shaken for 2 h and the liquid was collected afterwards.
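The amount of internal standard delivered to each extract follows directly from the volumes and concentrations given above. The short sketch below works through that arithmetic for benzene-d6; the function names are illustrative, and the assumption that the solvent and spike volumes are additive is ours, not the authors'.

```python
# Internal standard spiked into each extract (values taken from the text above).
SPIKE_VOLUME_ML = 0.050        # 50 µL of internal standard solution
BENZENE_D6_MG_PER_ML = 9.35    # stock concentration of benzene-d6
SOLVENT_VOLUME_ML = {"particulate": 6.0, "gaseous": 4.0}  # CH2Cl2 volumes


def spiked_mass_ug(stock_mg_per_ml: float, spike_ml: float) -> float:
    """Mass of internal standard delivered by the spike, in micrograms."""
    return stock_mg_per_ml * spike_ml * 1000.0


def extract_concentration_ug_per_ml(mass_ug: float, solvent_ml: float,
                                    spike_ml: float) -> float:
    """Concentration in the extract, assuming additive volumes (an assumption)."""
    return mass_ug / (solvent_ml + spike_ml)


mass = spiked_mass_ug(BENZENE_D6_MG_PER_ML, SPIKE_VOLUME_ML)
for phase, volume in SOLVENT_VOLUME_ML.items():
    conc = extract_concentration_ug_per_ml(mass, volume, SPIKE_VOLUME_ML)
    print(f"{phase}: {mass:.1f} µg benzene-d6, {conc:.1f} µg/mL in extract")
```

The same two functions apply to the second internal standard by substituting its stock concentration of 9.06 mg/mL.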

Derivatization Procedures
Trimethylsilyl (TMS) derivatization is routinely used in GC to increase the volatility and stability of acidic compounds [22]. In this study, trimethylsilylation of the acidic compounds in the collected samples was performed using N,O-bis(trimethylsilyl)trifluoroacetamide (BSTFA) as the derivatizing agent prior to GC-MS analysis. The extract solution was filtered through a PTFE membrane with a 0.45-µm pore size, and 1 mL of the filtrate was transferred into a glass gas chromatography vial. Then, 100 µL of BSTFA was added to the vial, which was sealed and heated at 60 ± 1 °C for 40 min in a water bath to complete the derivatization. The sample was then cooled to room temperature and analyzed by GC-MS.

GC-MS Conditions
Analysis was carried out on an Agilent 7890B gas chromatography instrument coupled with an Agilent 5977C mass spectrometer (Agilent Technologies, Palo Alto, CA, USA). Separation was achieved with a J&W DB-5ms fused silica capillary column (60 m × 0.25 mm i.d., 1-µm film thickness) from Agilent Technologies, USA. High-purity helium was used as the carrier gas and was maintained at 1 mL/min. The injection volume was 1 µL, and the split ratio was 10:1. The inlet temperature and the transfer line temperature were both set at 280 °C. The oven temperature program was as follows: the initial temperature was kept at 40 °C for 3 min, increased to 280 °C at a rate of 4 °C/min, and kept at 280 °C for 20 min. The spectrometer was operated in the electron ionization (EI) mode (70 eV). The scan range was set at 35–450 amu. The ionization source temperature was fixed at 280 °C. Data were collected under both the full scan and selected ion monitoring (SIM) modes. The identification of compounds was based on the National Institute of Standards and Technology (NIST) 14 library (matching score > 70) and on comparison with standards. The slopes of the calibration curves for each of the acids are also provided in Table 2.
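The oven program above fixes the length of each chromatographic run. As a quick check, the sketch below adds the two hold times to the duration of the linear ramp; the function name is illustrative and not part of any instrument software.

```python
def oven_run_time_min(start_c: float, end_c: float, rate_c_per_min: float,
                      initial_hold_min: float, final_hold_min: float) -> float:
    """Total run time of a single-ramp oven program, in minutes."""
    ramp_min = (end_c - start_c) / rate_c_per_min
    return initial_hold_min + ramp_min + final_hold_min


# 40 °C for 3 min, then 4 °C/min to 280 °C, then 20 min at 280 °C.
total = oven_run_time_min(40, 280, 4, 3, 20)
print(f"run time: {total:.0f} min")  # prints "run time: 83 min"
```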

Multivariate Analysis
Multivariate analysis, comprising principal component analysis (PCA) and orthogonal projections to latent structures discriminant analysis (OPLS-DA), was performed using the Umetrics SIMCA 14.1 software. Unsupervised PCA was performed first, followed by the supervised OPLS-DA model. Differences between the L- and the M-groups were evaluated.
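For readers without access to the commercial software, the unsupervised PCA step can be illustrated with a minimal pure-Python sketch. It is not the algorithm used by the software named above: it autoscales the data (an assumed pretreatment), extracts components by power iteration with deflation, and runs on a small hypothetical data matrix of six samples and three variables.

```python
import math


def autoscale(rows):
    """Mean-center each column and scale it to unit variance."""
    n, p = len(rows), len(rows[0])
    means = [sum(r[j] for r in rows) / n for j in range(p)]
    sds = [math.sqrt(sum((r[j] - means[j]) ** 2 for r in rows) / (n - 1))
           for j in range(p)]
    return [[(r[j] - means[j]) / sds[j] for j in range(p)] for r in rows]


def covariance(z):
    """Covariance matrix of already-centered data."""
    n, p = len(z), len(z[0])
    return [[sum(r[i] * r[j] for r in z) / (n - 1) for j in range(p)]
            for i in range(p)]


def leading_eigenpair(c, iters=500):
    """Power iteration for the dominant eigenpair of a symmetric matrix."""
    p = len(c)
    v = [float(i + 1) for i in range(p)]  # arbitrary non-zero start vector
    for _ in range(iters):
        w = [sum(c[i][j] * v[j] for j in range(p)) for i in range(p)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    eigval = sum(v[i] * sum(c[i][j] * v[j] for j in range(p)) for i in range(p))
    return eigval, v


def pca(rows, n_components=2):
    """Return explained-variance ratio, loadings, and scores per component."""
    z = autoscale(rows)
    c = covariance(z)
    p = len(c)
    total_var = sum(c[i][i] for i in range(p))
    components = []
    for _ in range(n_components):
        eigval, v = leading_eigenpair(c)
        scores = [sum(r[j] * v[j] for j in range(p)) for r in z]
        components.append({"explained": eigval / total_var,
                           "loadings": v, "scores": scores})
        # Deflate: remove the extracted component before finding the next one.
        c = [[c[i][j] - eigval * v[i] * v[j] for j in range(p)]
             for i in range(p)]
    return components


# Hypothetical acid contents: rows 0-2 mimic one cigarette type, rows 3-5 the other.
data = [[1.0, 2.1, 0.5], [1.2, 1.9, 0.4], [0.9, 2.0, 0.6],
        [2.0, 1.0, 0.5], [2.2, 1.1, 0.6], [1.9, 0.9, 0.4]]
result = pca(data)
for k, comp in enumerate(result, start=1):
    print(f"PC{k}: {100 * comp['explained']:.1f}% of variance")
```

In this toy example the two hypothetical groups separate along the first component, which is the kind of clustering a PCA score plot is meant to reveal.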

Constituents of Acids in the Particulate and the Gaseous Phases
Representative total ion chromatograms of the TMS derivatives of acidic compounds in the particulate and gaseous phases of the mainstream cigarette smoke are shown in Figure 1, and the identified peaks are listed in Table 2; a total of 46 acidic compounds were identified.
Contents of the individual acids are listed in Tables S1 and S2 and presented in Figure 2. The major acids in the particulate phase are straight-chain carboxylic acids, while those in the gaseous phase are palmitic acid, octadecanoic acid, acetic acid, and formic acid (Figure 2a,b). A number of acids have been identified in the particulate phase of iQOS, a commercial heated tobacco product. Acetic acid, palmitic acid, linolenic acid, propanoic acid, oleic acid, linoleic acid, 3-methylpentanoic acid, 3-methyl-2-furancarboxylic acid, 2-methylbutanoic acid, 3-methylbutanoic acid, and myristic acid, which were among the acids identified in this work, were also identified in the total particulate matter of iQOS and in cigarette smoke [23]. Moreover, octadecanoic acid, pentadecanoic acid, arachidic acid, (9Z,12Z)-18-hydroxy-9,12-octadecadienoic acid, palmitoleic acid, butanoic acid, behenic acid, 5-oxo-1-tetradecyl-3-pyrrolidinecarboxylic acid, lignoceric acid, stearidonic acid, and 3-methylpalmitic acid were identified in this work. In another work, acetic acid, formic acid, glycolic acid, lactic acid, and succinic acid were among the major acids in the mainstream and sidestream smoke of four varieties of tobacco [20]. In contrast, formic acid, acetic acid, and levulinic acid were identified in e-cigarette aerosol by high-performance liquid chromatography-high-resolution mass spectrometry (HPLC-HRMS) [24]. Clearly, the acidic constituents of cigarette smoke depend strongly on the samples tested. It is worth noting that certain weak organic acids, including but not limited to citric acid, acetic acid, butyric acid, lactic acid, and 2-methyl butyric acid, are among the permitted additives and flavorings for tobacco products [25]; their addition could therefore also contribute to the acidic components in the smoke of the commercial samples.

Acid Contents
The total amounts of acidic compounds in the particulate and the gaseous phases of all the samples are presented in Figure 3. All the acids were grouped according to the substituent groups on the alkyl chain. No significant difference in the gaseous phase (p = 0.0857) was found between the two groups (Figure 3b). Nevertheless, the high content of acids with straight or unsaturated chains is clear.
According to their chemical structure, the identified acids were categorized into eight groups: carboxylic acids with a straight alkyl chain, unsaturated acids, carboxylic acids with a branched alkyl chain, phenols, acids with a hydroxyl/carbonyl substitution, acids whose alkyl chain bears a furan ring, acids whose alkyl chain bears a benzene substitution, and acids with more than one substituent group. In the particulate phase (Figure 3a), acids with a straight saturated or unsaturated alkyl chain dominate. In the gaseous phase, most acids have straight alkyl chains with lengths ranging from C1 to C20.


The identified acids were categorized into eight groups according to the substituent groups: 1, carboxylic acid with straight alkyl chain; 2, unsaturated acid; 3, carboxylic acid with branched alkyl chain; 4, phenols; 5, acid with a hydroxyl/carbonyl substitution; 6, alkyl chain of the acid bearing a furan ring; 7, alkyl chain of the acid bearing a benzene substitution; and 8, acid with more than one kind of substituent group.

Acid Content Difference between the Particulate and the Gaseous Phases
Acids in the gaseous phase account for 7.6 ± 1.6% and 6.4 ± 2.6% of the total acids (particulate plus gaseous phases) for the L- and the M-types, respectively, with no significant difference between the two types. Decreased acidity of tobacco emissions has been reported to increase the amount of nicotine absorbed into the blood [26]. In mainstream smoke, >99% of the nicotine is in the particulate phase. A strong interaction between the acidic compounds and the basic nicotine is therefore expected, which hinders the release of organic acids into the gaseous phase.
The ratio of individual acid content between the particulate and the gaseous phases of mainstream cigarette smoke was also calculated and is presented in Figure 2c. Overall, organic acids with a straight or branched alkyl chain, including long-chain fatty acids, are relatively more abundant in the gaseous phase than acids bearing hydroxyl, furan, or phenol groups. Because the boiling points of long-chain fatty acids are comparable to or higher than those of the acids that accumulate preferentially in the particulate phase, their distribution between the gaseous and the particulate phases is not governed by volatility alone. One possible explanation is that polycyclic aromatic hydrocarbons (PAHs), tobacco-specific nitrosamines (TSNAs), and phytosterols are found almost exclusively in the particulate phase [27], and hydrogen-bonding interactions between these compounds and acids bearing hydroxyl, furan, or phenol groups are stronger than those with acids bearing only straight or branched alkyl groups.
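The gaseous-phase share quoted above is the gaseous content divided by the sum of the particulate and gaseous contents, summarized as a mean and standard deviation across samples. The sketch below illustrates the calculation; the three pairs of values are hypothetical and serve only to show the arithmetic.

```python
from statistics import mean, stdev


def gaseous_share_percent(particulate: float, gaseous: float) -> float:
    """Percentage of the total acid content found in the gaseous phase."""
    total = particulate + gaseous
    if total <= 0:
        raise ValueError("total acid content must be positive")
    return 100.0 * gaseous / total


# Hypothetical (particulate, gaseous) total acid contents for three cigarettes.
samples = [(92.0, 8.0), (94.5, 5.5), (91.0, 9.0)]
shares = [gaseous_share_percent(p, g) for p, g in samples]
print(f"gaseous share: {mean(shares):.1f} ± {stdev(shares):.1f} %")
```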

Acid Content Difference between the L- and the M-Types
The content differences of individual acids and of each acid group between the L- and the M-types are labelled in Figure 2a (individual acids in the particulate phase), Figure 2b (individual acids in the gaseous phase), and Table 3 (acid groups). In the particulate phase, a significant difference (p ≤ 0.05) between the L- and the M-types was observed for 26 acids, including 2-methyl-propionic acid, 2-methyl-butyric acid, 3-methyl-butyric acid, valeric acid, 3-methyl-pentanoic acid, 4-methyl-pentanoic acid, lactic acid, glycolic acid, 2-carbonylpropionic acid, 3-hydroxy-propionic acid, and 3-hydroxy-butyric acid. Among these, the contents of seven acids were highly significantly different (p ≤ 0.001). Most of these acids have a substituent group attached to a short alkyl chain. For the gaseous phase, the corresponding numbers were 12 (p ≤ 0.05) and two (p ≤ 0.001). Moreover, a significant difference in acid content was observed for four acid groups in the particulate phase and for two in the gaseous phase (Table 3). For total acid contents, no significant difference was observed between the M- and L-groups in either the particulate or the gaseous phase. These results show that the major difference between the L- and the M-types lies in the individual acids and acid groups of the particulate phase.
Significance markers in Table 3: particulate phase, *, ***, ***, **; gaseous phase, *, **. Note: * p ≤ 0.05, ** p ≤ 0.01, *** p ≤ 0.001.
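The p values referred to here come from Student's t-test comparing the two cigarette types. As a minimal sketch of that comparison, the code below implements the two-sided, pooled-variance form of the test using only the standard library, obtaining the p value by numerically integrating the t density. The pooled-variance form, the group sizes, and the contents are all assumptions made for illustration.

```python
import math
from statistics import mean, variance


def t_pdf(x: float, df: int) -> float:
    """Probability density of Student's t distribution."""
    coef = math.gamma((df + 1) / 2) / (math.sqrt(df * math.pi) * math.gamma(df / 2))
    return coef * (1 + x * x / df) ** (-(df + 1) / 2)


def two_sided_p(t: float, df: int, steps: int = 2000) -> float:
    """Two-sided p value: 1 - 2 * integral of the density from 0 to |t|."""
    upper = abs(t)
    if upper == 0:
        return 1.0
    h = upper / steps  # steps must be even for Simpson's rule
    s = t_pdf(0.0, df) + t_pdf(upper, df)
    for i in range(1, steps):
        s += (4 if i % 2 else 2) * t_pdf(i * h, df)
    return max(0.0, 1.0 - 2.0 * (s * h / 3))


def student_t_test(a, b):
    """Pooled-variance Student's t-test; returns (t, df, p)."""
    na, nb = len(a), len(b)
    df = na + nb - 2
    pooled = ((na - 1) * variance(a) + (nb - 1) * variance(b)) / df
    t = (mean(a) - mean(b)) / math.sqrt(pooled * (1 / na + 1 / nb))
    return t, df, two_sided_p(t, df)


# Hypothetical contents of one acid in five L-type and four M-type cigarettes.
l_type = [1.10, 1.25, 1.18, 1.30, 1.22]
m_type = [0.80, 0.85, 0.78, 0.90]
t_stat, dof, p_value = student_t_test(l_type, m_type)
print(f"t = {t_stat:.2f}, df = {dof}, p = {p_value:.2g}")
```

Running the test once per acid and per acid group gives the kind of p values that the star notation summarizes.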

Multivariate Analysis
Multivariate analysis with PCA and OPLS-DA methods was performed on the 46 acids identified in the particulate phase to reduce data dimensionality. The PCA results (Figure 4) show that the L- and the M-groups form separate clusters, confirming that the acidic components of the two groups differ. The first two principal components accounted for 88.1% (R²X) of the total variance. As the first principal component (PC1) and the second (PC2) account for 50.5% and 37.6% of the variance, respectively, they represent most of the information in the cigarette samples. The cumulative predictive ability of the model, Q²(cum) = 0.772 (>0.05), indicates the quality of the model.
Table 3. Differences in the grouped acids between cigarette samples (L- vs. M-types).

Compared with the PCA model, the OPLS-DA approach is a powerful statistical modeling tool for discriminating between two groups [28,29]. The results in Figure 5a also demonstrate separation between the two groups. The R²X of the OPLS-DA model was 0.798, indicating that 79.8% of the variation in the cigarette samples could be modeled by the selected components, while R²Y = 0.976 indicates that the model was well fitted. The predictability, Q² = 0.905, indicates the good predictive ability of the model. In addition, diagnostics such as permutation testing are of high importance to avoid overfitting. The intercept of the blue regression line of the Q² points in Figure 5b is (0, −0.838), which strongly indicates that the model is valid. Overall, these results show that combining the acid contents in the particulate phase with multivariate statistical analysis could effectively differentiate between the two types of cigarettes.
In panel (c): 1, formic acid; 2, acetic acid; 3, acrylic acid; 4, propionic acid; 5, 2-methyl-propionic acid; 6, 2-methyl-butyric acid; 7, 2-ene-butyric acid; 8, 3-methyl-butyric acid; 9, valeric acid; 10, 2-methyl-2-ene-butyric acid; 11, 2-pentenoic acid; 12, 3-methyl-pentanoic acid; 13, 4-methyl-pentanoic acid; 14 … 17, glycolic acid; 18, 2-carbonyl-propionic acid; 19, 2-hydroxy-butyric acid; 20, levulinic acid; 21, o-cresol; 22, 2-furancarboxylic acid; 23, 3-hydroxy-propionic acid; 24, p-cresol; 25, 3-hydroxy-butyric acid; 26, furanacetic acid; 27, 2-hydroxymethyl-butyric acid; 28, benzoic acid; 29, phenylacetic acid; 30, 3-methyl-2-furancarboxylic acid; 31, catechol; 32, 2,3-dihydroxy-propionic acid; 33, quinol/resorcinol; 34, 2-isopropyl-3-carbonyl-butyric acid; 35, malic acid; 36, 1,2,3-glycinol; 37, threonic acid; 38, m-hydroxy-benzoic acid; 39, vanillic acid; 40, tetradecanoic acid; 41, palmitic acid; 42, linoleic acid; 43, linolenic acid; 44, oleic acid; 45, octadecanoic acid; 46, eicosanic acid.
The variable importance in projection (VIP) scores reflect both the loading weights of each component and the variability of the response explained by this component. Information on the VIP of the acids is presented in Figure 5c and Table S3. In Figure 5c, the VIP plot is sorted from high to low, and the acids whose VIP is higher than one (an indication of 'important' variables) are colored in red, while those with VIP < 1.0 are in blue. A total of 24 acids, including 3-methyl-pentanoic acid, 2-methyl-butyric acid, 3-hydroxy-propionic acid, and 2-hydroxymethyl-butyric acid, were identified as the main components discriminating between the L- and the M-groups.
In general, variables with VIP > 1.0 and p < 0.05 are considered to contribute significantly to the model and are identified as markers differentiating between the two groups [30]. The p value reflects the compositional difference of an individual compound between groups, while a VIP value higher than one suggests that the compound contributes significantly to the intergroup difference. Seventeen acids meet both criteria: 3-methyl-pentanoic acid, 2-methyl-butyric acid, 3-hydroxy-propionic acid, 2-hydroxymethyl-butyric acid, 3-methyl-butyric acid, glycolic acid, o-cresol, 2-isopropyl-3-carbonyl-butyric acid, 2-methyl-propionic acid, malic acid, p-cresol, linolenic acid, tetradecanoic acid, caproic acid, m-hydroxy-benzoic acid, eicosanic acid, and 3-methyl-2-furancarboxylic acid, in order of decreasing VIP values. These compounds could be useful for establishing fingerprints of the tobacco products.
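The two-criterion marker selection described above amounts to a simple filter followed by a sort on VIP. The sketch below shows one way to express it; the compound labels and their VIP and p values are hypothetical placeholders, not values from Table S3.

```python
def select_markers(stats, vip_cutoff=1.0, p_cutoff=0.05):
    """Keep compounds with VIP > vip_cutoff and p < p_cutoff, sorted by VIP."""
    picked = [(name, s["vip"], s["p"]) for name, s in stats.items()
              if s["vip"] > vip_cutoff and s["p"] < p_cutoff]
    return sorted(picked, key=lambda item: item[1], reverse=True)


# Hypothetical VIP scores and t-test p values for four compounds.
example_stats = {
    "acid A": {"vip": 1.8, "p": 0.002},   # passes both criteria
    "acid B": {"vip": 1.2, "p": 0.20},    # high VIP, but not significant
    "acid C": {"vip": 0.7, "p": 0.01},    # significant, but low VIP
    "acid D": {"vip": 1.1, "p": 0.04},    # passes both criteria
}
markers = select_markers(example_stats)
for name, vip, p in markers:
    print(f"{name}: VIP = {vip}, p = {p}")
```

Requiring both criteria explains why fewer compounds are retained as markers than pass either the VIP threshold or the significance threshold alone.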

Conclusions
A comprehensive investigation was conducted to determine the concentrations of acidic compounds in both the particulate and gaseous phases of mainstream cigarette smoke. The study involved a comparative analysis of two commercially available cigarette types, namely the L-type and the M-type. A total of 46 acids were analyzed qualitatively and quantitatively using GC-MS. The results indicated that the concentrations of acids in the gaseous phase were much lower than those in the particulate phase. Acids with straight or unsaturated chains were found to be relatively more abundant in the gaseous phase, which could be attributed to their high volatility or to their weaker interaction with compounds in the particulate phase. Significant differences (p < 0.05) in acid content between the L- and M-types were observed for 26 acids in the particulate phase and 12 in the gaseous phase. Multivariate statistical analysis using PCA and OPLS-DA models successfully differentiated between the two tobacco types. These findings may be instrumental in the development of non-combusted tobacco products.
Supplementary Materials: The following supporting information can be downloaded at: https://www.mdpi.com/article/10.3390/pr11061694/s1, Table S1: Contents of the identified acids in the particulate phase (µg/mg tar); Table S2: Contents of the identified acids in the gaseous phase (µg/mg tar); Table S3: VIP values of individual acid.