Accuracy of two methods to detect the presence of halitosis: the volatile sulfur compounds concentration in the mouth air and the information from a close person

Abstract This study aimed to analyze the accuracy of two methods for detecting halitosis by comparing the organoleptic assessment by a trained professional (OA) with volatile sulfur compounds (VSC) measurement via Halimeter® (Interscan Corporation) and with information obtained from a close person (ICP). Methodology Participants were patients and companions who visited a university hospital over a one-year period for digestive endoscopy. A total of 138 participants were included in the VSC test, of whom 115 were also included in the ICP test. ROC curves were constructed to establish the best VSC cut-off points. Results The prevalence of halitosis was 12% (95%CI: 7% to 18%) and 9% (95%CI: 3% to 14%) for the OA and ICP, respectively. At the cut-off point >80 parts per billion (ppb) VSC, the prevalence of halitosis was 18% (95%CI: 12% to 25%). At the cut-off point >65 ppb VSC, sensitivity and specificity were 94% and 76%, respectively. At the cut-off point >140 ppb, sensitivity was 47% and specificity 96%. For the ICP, sensitivity was 14% and specificity 92%. Conclusions VSC presents high sensitivity at the cut-off point of >65 ppb and high specificity at the cut-off point of >140 ppb. ICP had high specificity, but low sensitivity. The OA can express either occasional or chronic bad breath, whereas the ICP can be a potential instrument to detect chronic halitosis.


Introduction
How can halitosis be measured and diagnosed?
How can people know whether or not they have bad breath? These and other inquiries have fueled several attempts to develop halitosis detector instruments in the last century. 1 Halitosis is a universal symptom of important social impact, which occurs in chronic and occasional forms. [2][3][4] It occasionally affects about 15% to 58% of the population, and probably around 15% (95% CI 11% to 19%) of the population present bad breath constantly. [5][6][7] Inconsistencies between estimates of halitosis prevalence probably result from the different methods and criteria used to define the presence of halitosis. 8,9 Population survey data in Japan indicated that time of day is an important factor to consider in halitosis prevalence estimates. That survey detected a higher frequency of bad breath in the morning compared to the afternoon and also in individuals with more than 2.5 hours since the last oral activity (food intake, brushing, etc.). 10
Halitosis can be measured by organoleptic measurements, portable clinical devices (e.g., OralChroma, Halimeter®), gas chromatography analysis, and even by interview (e.g., HALT questionnaire). 11 The breath odor organoleptic assessment by a trained professional (OA) is considered the gold standard of halitosis diagnostic methods. [12][13][14] However, there are concerns about its accuracy and reproducibility, which are influenced not only by the degree of aspiration (fast or long) but also by interference from other factors (psychological, cultural, physiological, among others). Therefore, alternative methods that offer greater comparability are needed. 12 In this context, more objective instrumental measures are desirable, although the accuracy of such diagnostic methods is still questionable. 12,15
One portable breath meter is the Halimeter® (Interscan Corporation, https://www.halimeter.com/). This instrument accurately reflects the levels of hydrogen sulfide; however, other volatile sulfur compounds (VSC) detected by gas chromatography may be underestimated. The Halimeter® is not able to differentiate the types and concentrations of sulfur compounds in breath samples and, in addition, it needs periodic recalibration. The reproducibility of this method has been considered satisfactory, especially for stationary VSC levels. 12,16
Interviews with participants are infrequent in oral odor investigations, probably because halitosis self-assessment is a questionable method for bad breath diagnosis. [17][18][19][20] Interviews with a close person (ICP) to detect halitosis are rare in the scientific literature. One study estimated that the prevalence of halitosis using
A total of 138 participants were eligible for the VSC study and, of this group, 115 people met all the inclusion criteria and were included in the two analyses, the VSC and the ICP. The stages of data collection are described in Figure 1.

Test-retest reliability
For the ICP test-retest reliability study, the participants were randomly selected and reinterviewed by phone fifteen days after the first interview at the hospital. The selection process started in the second semester of the study period and consisted of drawing one week in each month of data collection. All participants included in these randomly selected weeks also participated in the reliability study. The retest weeks were drawn by generating random numbers with the statistical software OpenEpi Version 3.01.
Of the 53 participants selected for the ICP test-retest reliability study, 13 (25%) were lost. The losses were due to incorrect telephone numbers for the close person or to failure to reach the person after three attempts at alternate times. In the end, 40 individuals were included in the reliability study.
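The monthly draw of retest weeks can be illustrated with a short sketch. It is not the procedure used in the study, which relied on OpenEpi; the function name `draw_retest_weeks`, the month labels, and the assumption of four candidate weeks per month are hypothetical.

```python
import random
from typing import Dict, Iterable, Optional


def draw_retest_weeks(months: Iterable[str],
                      weeks_per_month: int = 4,
                      seed: Optional[int] = None) -> Dict[str, int]:
    """Draw one week number (1..weeks_per_month) at random for each month."""
    if weeks_per_month < 1:
        raise ValueError("weeks_per_month must be at least 1")
    rng = random.Random(seed)  # a seeded generator keeps the draw reproducible
    return {month: rng.randint(1, weeks_per_month) for month in months}


# Hypothetical labels for the six months of the second semester.
months = [f"month_{i}" for i in range(7, 13)]
selected = draw_retest_weeks(months, seed=2023)
for month, week in selected.items():
    print(month, "-> week", week)
```

Under this sketch, every participant interviewed in a drawn week would enter the reliability sample.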

Definition of halitosis
For the gold standard, OA, the subject's halitosis was measured by organoleptic assessment by a trained professional and was classified into one of the following five scores: 0-no halitosis or very good breath odor; 1-mild halitosis or good breath odor; 2-moderate halitosis or moderate breath odor; 3-severe halitosis or bad breath odor; 4-very severe halitosis or very bad breath odor. Two different cut-off points were established to define the presence of halitosis: 1) a specific cut-off point, which considers scores 3 and 4 as the presence of halitosis; 2) a sensitive cut-off point, which considers scores 2, 3, and 4 as the presence of halitosis.
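As a minimal sketch of how the two cut-off points turn the five-point score into a yes/no classification, the example below dichotomizes a list of scores. The constant names, the function `has_halitosis`, and the example scores are hypothetical and are not taken from the study data.

```python
from typing import List

SPECIFIC_MIN_SCORE = 3   # specific cut-off: scores 3 and 4 count as halitosis
SENSITIVE_MIN_SCORE = 2  # sensitive cut-off: scores 2, 3, and 4 count as halitosis


def has_halitosis(score: int, min_score: int) -> bool:
    """Return True when a 0-4 organoleptic score reaches the chosen cut-off."""
    if score not in range(5):
        raise ValueError("score must be an integer from 0 to 4")
    return score >= min_score


# Hypothetical scores for seven participants.
scores: List[int] = [0, 1, 2, 3, 4, 2, 0]
specific = [has_halitosis(s, SPECIFIC_MIN_SCORE) for s in scores]
sensitive = [has_halitosis(s, SENSITIVE_MIN_SCORE) for s in scores]

print("positives at the specific cut-off:", sum(specific))
print("positives at the sensitive cut-off:", sum(sensitive))
```

The same score of 2 is negative under the specific cut-off and positive under the sensitive one, which is why the two definitions yield different prevalence and accuracy figures.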
During the organoleptic examination, the participant was instructed to keep the mouth closed for one minute, breathing through the nose, and then count from one to ten with the mouth at a distance of approximately 20 centimeters from the trained professional's nose.
Random selection logistics
Observer "1" chooses which of the two participants will be the first to go through the OA evaluation. The pair of participants is taken to the place (reserved room) where observer "2" waits for the beginning of the organoleptic measurement. The participants enter the room (one at a time, following the order of the previous draw), undergo a professional organoleptic assessment, and then the Halimeter is used to measure VSC levels.

ICP survey
After the OA and VSC data collection, the ICP data were obtained by face-to-face interviews with
Two weeks after the first ICP, a random sample of people who answered the first questionnaire was selected for the test-retest, and the same interview was conducted by telephone.

Data analysis
The prevalence of halitosis and 95% confidence intervals (95%CI) were calculated and reported according to each measurement strategy, that is, OA, VSC, and ICP.
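A prevalence with its 95% confidence interval can be computed as in the sketch below. The source does not state which interval method was used, so the normal-approximation (Wald) interval here is an assumption, and the count of 17 positives among 138 participants is a hypothetical input chosen only because it rounds to about 12%.

```python
import math
from typing import Tuple


def prevalence_with_ci(cases: int, n: int, z: float = 1.96) -> Tuple[float, float, float]:
    """Return (prevalence, lower, upper) using the Wald normal approximation."""
    if n <= 0 or not 0 <= cases <= n:
        raise ValueError("need n > 0 and 0 <= cases <= n")
    p = cases / n
    half_width = z * math.sqrt(p * (1 - p) / n)
    # Clip to [0, 1] because the Wald interval can overshoot for small counts.
    return p, max(0.0, p - half_width), min(1.0, p + half_width)


# Hypothetical counts, not reported in the text above.
p, lower, upper = prevalence_with_ci(cases=17, n=138)
print(f"{p:.0%} (95%CI: {lower:.0%} to {upper:.0%})")
```

For small samples or proportions close to 0 or 1, an exact or Wilson interval would usually be preferred over the Wald approximation.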
For the test-retest reliability analysis of the ICP, simple and weighted Kappa coefficients (quadratic weighting) were calculated. 23 To define the VSC cut-off points, a receiver operating characteristic (ROC) curve was constructed,
The greatest agreements were observed between the "good odor" scores. The greatest disagreements occurred between the "good odor" and "moderate odor" scores, that is, the individual's breath was defined as "good odor" in the first interview but as "moderate odor" in the second interview. There was a high coefficient for global agreement. The simple Kappa coefficient, calculated at the specific cut-off
is observed in all analyzed cut-off points. By increasing the test cut-off point, a percentage reduction of false positives is observed at the expense of an increase in the percentage of false negatives.
ICP and OA specific cut-offs. The sensitivity in both tests was less than 0.5.
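The trade-off between false positives and false negatives as the cut-off rises can be sketched as follows. The VSC readings and reference classifications are hypothetical values invented for illustration; only the three cut-off points (65, 80, and 140 ppb) come from the text.

```python
from typing import Dict, Sequence, Tuple


def sensitivity_specificity(values: Sequence[float],
                            reference: Sequence[bool],
                            cutoff: float) -> Tuple[float, float]:
    """Classify values above the cut-off as positive and compare with the reference."""
    tp = fp = tn = fn = 0
    for value, is_case in zip(values, reference):
        test_positive = value > cutoff
        if test_positive and is_case:
            tp += 1
        elif test_positive:
            fp += 1
        elif is_case:
            fn += 1
        else:
            tn += 1
    if tp + fn == 0 or tn + fp == 0:
        raise ValueError("reference must contain both cases and non-cases")
    return tp / (tp + fn), tn / (tn + fp)


# Hypothetical VSC readings (ppb) and reference (OA) classifications.
vsc_ppb = [40, 60, 70, 90, 100, 150, 160, 200, 50, 75]
oa_positive = [False, False, False, False, True, True, False, True, False, True]

results: Dict[int, Tuple[float, float]] = {}
for cutoff in (65, 80, 140):
    results[cutoff] = sensitivity_specificity(vsc_ppb, oa_positive, cutoff)
    sens, spec = results[cutoff]
    print(f">{cutoff} ppb: sensitivity={sens:.2f}, specificity={spec:.2f}")
```

In this toy data set, raising the cut-off lowers sensitivity (more false negatives) while raising specificity (fewer false positives), the same direction of change described in the results.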

Discussion
Previous studies have estimated that the prevalence of halitosis varies between 2% and 49% in different
Percentage of ¹false positive and ²false negative.
Interview with a close person scores cut-off points: 1) specific cut-off point (score > 2) = very severe or severe halitosis; 2) sensitive cut-off point (score > 1) = very severe, severe, or moderate halitosis
Organoleptic assessment scores cut-off points: 1) specific cut-off point (score > 2) = very severe or severe halitosis; 2) sensitive cut-off point (score > 1) = very severe, severe, or moderate halitosis
Organoleptic assessment and interview with a close person scores cut-off points: 1) specific cut-off point (score > 2) = very severe or severe halitosis; 2) sensitive cut-off point (score > 1) = very severe, severe, or moderate halitosis
Table 6-Accuracy of the interview with a close person (ICP) compared to organoleptic assessment by a trained professional (OA)
J Appl Oral Sci. 2023;31:e20220412
halitosis found in a previous study that relied on the information from a close person. 7 The VSC was the test that stood out the most, both to detect negative (specificity: 96%, cut-off 140 ppb) and positive cases of halitosis (sensitivity: 94%, cut-off 65 ppb). Although the ICP was also accurate in capturing truly negative cases of severe or very severe halitosis (specificity: 92%), it presented the worst sensitivity (<50%). Previous studies found a sensitivity that varied from 52% to 90% and specificity from 45% to 90% for VSC. 6,28,29
The diagnostic strategy aiming for maximum specificity of both VSC and ICP seems to be more appropriate when the main concern is to avoid a false-positive result. This situation can be relevant in cases in which the patient, supposedly with bad breath, goes to a specialized gastroenterology service, assuming that
On the other hand, the strategy aiming at greater sensitivity in halitosis measurements may fit cases that require simple treatments that are easy to perform.
This may be the case of a patient who seeks a dental service with a complaint of halitosis. A test with high sensitivity, even though it generates a higher percentage of false positives, might be a more appropriate alternative since most cases of halitosis are solved with simple and low-complexity procedures, such as changing oral hygiene habits, among others. 32-35 It is noteworthy that about 90% of halitosis cases originate in the oral cavity, which provides a suitable environment for bacterial growth.
These bacteria are mainly retained on the tongue and periodontium and can cause halitosis. 3 In the present study, the area under the ROC curve ranged from 78% to 89%, but a previous study found an area under the ROC curve of 67% for VSC. 12 The VSC test in the present study showed both high sensitivity (94%) at the 65 ppb cut-off point and high specificity (96%) at the 140 ppb cut-off point. The Kappa agreement coefficient ranged from 0.40 to 0.48.
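For readers less familiar with the Kappa coefficient, the sketch below computes Cohen's kappa from two sets of classifications, with an optional quadratic weighting as named in the data analysis. It is a generic textbook formulation, not the authors' software, and the function name `cohen_kappa` and the example ratings are hypothetical.

```python
from typing import Sequence


def cohen_kappa(first: Sequence[int], second: Sequence[int],
                n_categories: int, weighting: str = "simple") -> float:
    """Cohen's kappa for categories coded 0..n_categories-1.

    weighting="simple" treats every disagreement equally;
    weighting="quadratic" penalizes disagreements by squared distance.
    """
    if len(first) != len(second) or len(first) == 0:
        raise ValueError("ratings must be non-empty and of equal length")
    if n_categories < 2 or weighting not in ("simple", "quadratic"):
        raise ValueError("need at least 2 categories and a known weighting")
    k, n = n_categories, len(first)

    counts = [[0] * k for _ in range(k)]
    for a, b in zip(first, second):
        counts[a][b] += 1
    row_totals = [sum(row) for row in counts]
    col_totals = [sum(counts[i][j] for i in range(k)) for j in range(k)]

    def disagreement(i: int, j: int) -> float:
        if weighting == "simple":
            return 0.0 if i == j else 1.0
        return ((i - j) / (k - 1)) ** 2

    cells = [(i, j) for i in range(k) for j in range(k)]
    observed = sum(disagreement(i, j) * counts[i][j] for i, j in cells) / n
    expected = sum(disagreement(i, j) * row_totals[i] * col_totals[j]
                   for i, j in cells) / n ** 2
    if expected == 0:
        raise ValueError("kappa is undefined when no disagreement is expected by chance")
    return 1.0 - observed / expected


# Hypothetical binary classifications (1 = halitosis) from two methods.
method_a = [1, 1, 0, 0, 1, 0, 0, 0, 1, 0]
method_b = [1, 0, 0, 0, 1, 0, 1, 0, 1, 0]
kappa = cohen_kappa(method_a, method_b, n_categories=2)
print(f"kappa = {kappa:.2f}")
```

With only two categories the simple and quadratic versions coincide; the weighting matters for ordinal scales such as the five-point odor score, where a one-step disagreement is penalized less than a four-step one.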

Previous studies reported agreement between VSC and OA that varied around 0.60. 12,28,36
not with a random sample of the general population.
However, even if there were a difference between the halitosis profile of the participants and that of the general population, it would not interfere with the main purpose of this study, which was to assess the accuracy of the VSC and the ICP, compared to the OA.
A second limitation is the loss of 25% of the patients invited to participate in the test-retest. In addition to the two-week interval between the first and second interviews, the fact that the second interview was conducted by telephone could explain this loss.
A third limitation is that all patients were fasting, which could have contributed to an increase in the prevalence of halitosis. However, it is unlikely that this influenced the accuracy of the halitosis measuring methods.
No information was collected on participants' eating habits, medications, oral health, and oral hygiene habits. Although these factors may contribute to the presence of halitosis, we have no reason to believe that they interfered with the accuracy study.
Sensitivity, specificity, and predictive values are useful measures in the evaluation of diagnostic tests.
However, clinical benefits, economic burdens, and advantages and disadvantages over other tests also need to be considered. Knowledge of the techniques for validating and interpreting diagnostic tests is, therefore, essential for health professionals so that they can guide their decisions about the real usefulness of tests on a scientific basis.
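To show how predictive values follow from sensitivity, specificity, and prevalence, the sketch below applies Bayes' rule. The inputs are the VSC figures reported in the abstract (sensitivity 94% and specificity 76% at >65 ppb; 47% and 96% at >140 ppb) combined with the 12% OA prevalence; the resulting predictive values are illustrative calculations, not results reported by the authors, and the function name is hypothetical.

```python
from typing import Tuple


def predictive_values(sensitivity: float, specificity: float,
                      prevalence: float) -> Tuple[float, float]:
    """Return (positive predictive value, negative predictive value)."""
    for name, value in (("sensitivity", sensitivity),
                        ("specificity", specificity),
                        ("prevalence", prevalence)):
        if not 0.0 <= value <= 1.0:
            raise ValueError(f"{name} must be between 0 and 1")
    true_pos = sensitivity * prevalence
    false_pos = (1 - specificity) * (1 - prevalence)
    true_neg = specificity * (1 - prevalence)
    false_neg = (1 - sensitivity) * prevalence
    if true_pos + false_pos == 0 or true_neg + false_neg == 0:
        raise ValueError("predictive value undefined for these inputs")
    return true_pos / (true_pos + false_pos), true_neg / (true_neg + false_neg)


ppv_65, npv_65 = predictive_values(0.94, 0.76, 0.12)    # cut-off >65 ppb
ppv_140, npv_140 = predictive_values(0.47, 0.96, 0.12)  # cut-off >140 ppb
print(f">65 ppb:  PPV={ppv_65:.2f}, NPV={npv_65:.2f}")
print(f">140 ppb: PPV={ppv_140:.2f}, NPV={npv_140:.2f}")
```

Because predictive values depend on prevalence, the same test would perform differently in a population where halitosis is more or less common than in this sample.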

Conclusion
The VSC presents high sensitivity at the cut-off point of >65 ppb and high specificity at the cut-off point of >140 ppb; however, the best test characteristics were detected at the cut-off point of >80 ppb (sensitivity: 0.65; specificity: 0.88). ICP had high specificity, but low sensitivity. The OA can express either occasional or chronic bad breath, whereas the ICP can be a potential instrument to detect chronic halitosis.

Data availability statement
The datasets generated and analyzed during the current study are not publicly available due to ethical issues; data underpinning this publication cannot be made openly available according to Brazilian legislation.