Is tele-diagnosis of dental conditions reliable during COVID-19 pandemic? Agreement between tentative diagnosis via synchronous audioconferencing and definitive clinical diagnosis

Objectives To assess the reliability of synchronous audioconferencing teledentistry (TD) in making tentative diagnosis compared to definitive clinical face-to-face (CFTF) diagnosis; and whether agreement was influenced by dentist's experience, caller-patient relationship, and time of call. Methods All patients calling the TD hotline during COVID-19 pandemic, triaged as emergency/ urgent and referred for CFTF care were included (N=191). Hotline dentists triaged the calls, made tentative audio-dentistry (AD) diagnosis, while dentists at point of referral made the definitive CFTF diagnosis. Cohen's weighted kappa (κ) assessed the extent of agreement between AD vs CFTF diagnosis. Results There was significantly very good pair-wise agreement (κ = 0.853, P < 0.0001) between AD and CFTF diagnosis. AD diagnosis of pulpitis and periodontitis exhibited the most frequent disagreements. Tele-dentists with ≥ 20 years’ experience exhibited the highest level of agreement (κ =0.872, P < 0.0001). There was perfect agreement when mothers mediated the call (κ = 1, P < 0.0001), and very good agreement for calls received between 7 am-2 pm (κ = 0.880, P < 0.0001) compared to calls received between 2-10 pm (κ = 0.793, P < 0.0001). Conclusions Remote tentative diagnosis using AD is safe and reliable. Reliability was generally very good but varied by dentist's experience, caller-patient relationship, and time of call. Clinical significance The findings suggest that using AD in the home environment is safe and reliable, deploying providers with variable years of experience. The findings have generalizability potential to a variety of similar circumstances, healthcare settings and epi/pandemic situations.


Introduction
The global lockdown during the COVID-19 pandemic [1] affected all aspects of health care including oral health. Hence, traditional dentistry was stopped as face-to-face interactions with patients generate saliva and blood-containing aerosols, risking large-scale transmission of the virus, despite infection control measures [2][3][4][5].
A move from traditional care to virtual health technologies became critical to continue patient care and reduce transmission [5,6]. Hence, teledentistry (TD) emerged as a valuable tool to address oral and dental health issues during the pandemic [7]. It employs telecommunication and digital technology for consultation, treatment planning and dental care from different geographical locations [8,9]. It involves real-time synchronous (audioconferencing or videoconferencing) or store-and-forward (asynchronous) communication, which provides excellent results for most dental applications [10]. Telephone triage is the systematic assessment of patients to provide safe and effective care by telephone [11]. In response, at the start of the pandemic, Hamad Dental Center (HDC), the public tertiary dental care provider in the State of Qatar, initiated a dedicated TD hotline to triage, consult, identify oral/dental diseases, and treat and/or refer according to the condition's urgency [12].
Although TD is reliable in making diagnoses [13,14], most studies evaluated the store-and-forward mode, with limited research on real-time (synchronous) communication. This is important, as TD during the early pandemic was not typical. Firstly, audio-dentistry (AD) during the early days of the pandemic was not uncommon [15]. AD refers to the use of a synchronous audioconferencing approach in delivering TD services. Indeed, AD comprised a major part of our service: most patients called via telephone, video calls were not conducted, and few patients sent images because not all callers were tech-literate or comfortable with a new mode of interaction or with sending photographs.
Secondly, traditional TD comprises the patient with a trained dental professional (dental assistant/ student/ hygienist) at one end, and the consulting dentist at the other [16]. During the current pandemic, TD took place directly between the dentist and the patient, sometimes with the mediation of a family member, guardian, or caregiver at the patient's end (caller-patient relationship). This new setting necessitates the training of dentists and depends on the ability/ willingness of the mediator to carry out required tasks and provide real-time feedback to the tele-dentist.
Although telephone triage research during the pandemic explored the epidemiology of dental emergencies and management pathways [15,[17][18][19][20], no studies objectively evaluated the reliability of real-time AD diagnosis compared to direct, definitive clinical face-to-face (CFTF) diagnosis. Furthermore, to our knowledge, no studies have explored the influence of dentist's experience on AD reliability, even though dentist-related factors influence diagnostic accuracy [21,22]. Similarly, no previous study evaluated the effect of caller-patient relationship on reliability, even though family members can contribute to the remote patient evaluation, provide context, and help with the physical exam [23]. In addition, the relationship between the time of the call and reliability of AD has not been appraised, notwithstanding that telephone triage appears least safe after hours [24]. One study assessed the success rate of TD emergency management by evaluating improvement in patient-reported symptoms [17]; however, this does not accurately appraise the reliability of AD in a pandemic context. Therefore, the aim of this cross-sectional retrospective study was to explore the reliability of using AD in making tentative diagnosis. We assessed: 1) the extent of agreement between tele-dentists' tentative diagnosis of oral and dental conditions using AD vs the established definitive CFTF diagnosis by dentists at point of referral (HDC or Hospital emergency); and, 2) whether such agreement was influenced by the dentist's experience, caller-patient relationship, and time of the call.

Ethics, design, and participants
This service evaluation project was granted permission to proceed by our hospital's Institutional Review Board (IRB). It is a retrospective analysis of data routinely collected for clinical audit and service evaluation. We defined AD as the use of a synchronous audioconferencing approach in remote diagnosis and delivery of TD services; and clinical face-to-face (CFTF) diagnosis as the definitive diagnosis established at the point of referral after physically examining the patient. We analyzed selected caller and call characteristics and compared AD diagnosis with a gold standard definitive CFTF diagnosis for the patients who were referred for CFTF care and presented at the point of referral during the first wave of COVID-19 lockdown (5 months, 23 March-31 August 2020). During the lockdown period, 1239 patients called the hotline. Calls with incomplete records, where diagnosis was not clearly indicated, were excluded (N= 398), leaving 841 callers, of whom 250 were referred. Despite referral, only 223 patients showed up at the referral point.
To limit the study to AD, we excluded callers who sent self-taken mobile phone photographs of their condition (N= 32) leaving 191 callers included in this report (Fig. 1).

Setting and procedures
HDC set up a hotline to ensure the continuity of services and to avoid missing life-threatening or emergency dental conditions. Based on international recommendations [25][26][27], 11 qualified dentists (Fig. 2) prepared algorithms to triage the call, arrive at tentative diagnosis, guide management of complaints (pain, swelling, bleeding, trauma, oral-mucosal ulceration) by providing remote care and/or referral to a dental or hospital emergency facility.
We retrieved data routinely collected as part of the service using the teledentistry-data-form. This form included characteristics of the: 1) patient: nationality, age, sex, previous medical history, history of allergies; 2) call: time and duration, and caller-patient relationship; and 3) condition: chief complaint, pain severity (scale 0-10) [28], tentative diagnosis, triage category (emergency, urgent, non-urgent), dental specialty needed, and management decision (remote instructions and medications vs referral for face-to-face management). The scope of the current report is on diagnosis. For each call, the dentist completed a teledentistry-data-form and used it with the algorithms to make decisions. All 11 hotline tele-dentists received training to ensure consistency, and a dedicated workspace was utilized while caller privacy was observed.
For patients referred for physical clinical care, we retrieved the CFTF diagnosis made by dentists at the referral point and evaluated its agreement with the tele-dentists' AD tentative diagnosis.

Statistical analysis
Descriptive and inferential statistics characterized the sample. Descriptive results for continuous variables are presented as mean ± standard deviation; categorical variables as frequencies and percentages. Extent of agreement between the tentative AD diagnosis and definitive CFTF diagnosis was measured by Cohen's weighted kappa (κ) analysis and interpreted as described by Altman [29]. Weighted κ < 0.20 indicated poor level of agreement, 0.21-0.40 fair agreement, 0.41-0.60 moderate agreement, 0.61-0.80 good agreement, and 0.81-1.00 very good agreement. Statistical analyses were performed using Statistical Package for Social Sciences Version 22 (SPSS). P value < 0.05 was considered statistically significant.
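The agreement statistic and the interpretation bands described above can be illustrated with a minimal sketch. This is not the SPSS procedure used in the study; it is a plain-Python illustration under stated assumptions. The function names (`cohen_kappa`, `altman_band`), the linear weighting option, and the toy diagnosis labels are all hypothetical and chosen only for illustration. The text does not state which weighting scheme was applied, and weighting presumes an ordering of the categories; with `weights=None` the function returns the unweighted kappa. Treating a value of exactly 0.20 as "poor" is also an assumption about the band boundary.

```python
def cohen_kappa(rater_a, rater_b, categories, weights=None):
    """Cohen's kappa for two raters over the same cases.

    weights=None  -> unweighted kappa (any disagreement counts fully)
    weights="linear" -> linearly weighted kappa (assumes ordered categories)
    """
    k = len(categories)
    index = {c: i for i, c in enumerate(categories)}
    n = len(rater_a)

    # Cross-tabulate the two sets of diagnoses.
    table = [[0] * k for _ in range(k)]
    for a, b in zip(rater_a, rater_b):
        table[index[a]][index[b]] += 1

    row_totals = [sum(table[i]) for i in range(k)]
    col_totals = [sum(table[i][j] for i in range(k)) for j in range(k)]

    def disagreement_weight(i, j):
        if weights == "linear":
            return abs(i - j) / max(k - 1, 1)
        return 0.0 if i == j else 1.0

    # Weighted observed and chance-expected disagreement.
    observed = sum(disagreement_weight(i, j) * table[i][j]
                   for i in range(k) for j in range(k)) / n
    expected = sum(disagreement_weight(i, j) * row_totals[i] * col_totals[j]
                   for i in range(k) for j in range(k)) / (n * n)
    return 1.0 - observed / expected


def altman_band(kappa):
    """Map a kappa value to the agreement bands listed in the text."""
    if kappa <= 0.20:
        return "poor"
    if kappa <= 0.40:
        return "fair"
    if kappa <= 0.60:
        return "moderate"
    if kappa <= 0.80:
        return "good"
    return "very good"


# Toy data (hypothetical, not study data): four cases, two categories.
tentative = ["pulpitis", "pulpitis", "abscess", "abscess"]
definitive = ["pulpitis", "abscess", "abscess", "abscess"]
toy_kappa = cohen_kappa(tentative, definitive, ["pulpitis", "abscess"])
print(toy_kappa, altman_band(toy_kappa))
```

Applied to values reported in this paper, `altman_band(0.853)` falls in the "very good" band and `altman_band(0.678)` in the "good" band.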

Selected callers' and call characteristics
Female callers (54.5%) slightly outnumbered male callers (Table 1). More than a quarter of callers (26.7%) were >18 years, 60.2% were Qatari nationals, and 27.75% were other Arab nationals. Nearly two-thirds (64.4%) of the sample reported no previous medical history. More than a third of the sample required a family member/ caregiver to mediate the call. Most calls (68.6%) were made during the morning (7 am-2 pm), with a mean duration of 5.36 ± 3.39 minutes (range 1-20 minutes). Broken/ loose orthodontic appliance was the most frequent condition (26%) observed among patients referred for face-to-face management (Fig. 3). This was followed by pulpitis (22%) and soft tissue injury from orthodontic appliance (19%).

Agreement between tentative AD diagnosis vs definitive CFTF diagnosis
The AD tentative diagnosis made by tele-dentists was reliable when compared with definitive CFTF diagnosis (Table 2), as Cohen's weighted Kappa showed very good and significant pair-wise agreement (κ = 0.853, P < 0.0001). However, 25 out of 191 (13.09%) conditions were misdiagnosed. Conditions that were remotely diagnosed as dental abscess, dry socket, tooth luxation/avulsion, temporomandibular dysfunction, and salivary gland disease were least likely to be misdiagnosed. Conversely, the most frequent disagreements were related to AD diagnosis of pulpitis (12 occasions, 28% of patients with a remote diagnosis of pulpitis), where CFTF dentists diagnosed them as periodontal disease, tooth fracture, broken/ loose dental prosthesis/ restoration, broken/ loose orthodontic appliance, ulcer, and cyst. AD diagnosis of pericoronitis also had 3 disagreements, as CFTF dentists at point of referral identified such cases as pulpitis or cyst. Most (80%) remote diagnoses of periodontal disease disagreed with the CFTF diagnosis, as these were clinically diagnosed as pulpitis, soft tissue injury from orthodontic appliance, or broken/loose dental prosthesis. AD diagnosis of tooth fracture had 2 disagreements, where CFTF diagnosis confirmed pulpitis and pericoronitis. Likewise, remote diagnosis of dental abscess had 2 disagreements, where CFTF dentists diagnosed these as tooth fracture and broken/ loose dental prosthesis/ restoration. Finally, the AD diagnosis of ulcer had only 1 disagreement, as the CFTF diagnosis was broken/ loose dental prosthesis.
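The headline proportions can be re-derived from the counts reported in the text. The short sketch below uses only those counts (191 patients, 25 disagreements, 12 of them involving an AD diagnosis of pulpitis); the variable names are illustrative. The raw agreement figure it prints is derived here rather than reported by the authors, and unlike kappa it is not corrected for chance agreement.

```python
# Counts taken from the text; variable names are illustrative only.
total_patients = 191
misdiagnosed = 25
pulpitis_disagreements = 12

misdiagnosis_rate = misdiagnosed / total_patients        # share of all patients
raw_agreement = 1 - misdiagnosis_rate                    # not chance-corrected
pulpitis_share = pulpitis_disagreements / misdiagnosed   # share of all disagreements

print(f"Misdiagnosed: {misdiagnosis_rate:.2%}")
print(f"Raw agreement: {raw_agreement:.2%}")
print(f"Pulpitis share of disagreements: {pulpitis_share:.0%}")
```

The first figure reproduces the 13.09% stated above, and the last reproduces the 48% share of misdiagnoses attributed to pulpitis in the Discussion.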

Extent of agreement by dentist's years of experience, caller-patient relationship, and time of call
Extent of agreement between AD vs CFTF diagnosis varied by dentists' experience, caller-patient relationship, and time of call (Table 3). Dentists with ≥ 20 years' experience exhibited the highest level of agreement (κ = 0.872, P < 0.0001). Dentists with ≤ 5, 10-15, and 15-20 years' experience still had a statistically significant very good level of agreement (κ range 0.843-0.856, P < 0.0001), while those with 5-10 years' experience had good agreement that was also significant (κ = 0.678, P = 0.002).
As for caller-patient relationship, when mothers mediated the call, there was perfect agreement between the AD and CFTF diagnoses (κ = 1.0, P < 0.0001), and there was also very good agreement when fathers mediated the call (κ = 0.872, P < 0.0001), or when the caller was the patient (κ = 0.821, P < 0.0001). Agreement was slightly lower when calls were mediated by others (e.g., other family members, friends, or caregivers), although it was still good and statistically significant (κ = 0.788, P < 0.0001).
Agreement also varied by time of call. Agreement between AD and CFTF diagnosis was very good for calls received between 7 am-2 pm (κ = 0.880, P < 0.0001) and good for calls received between 2 pm-10 pm (κ = 0.793, P < 0.0001).

Discussion
Although TD is not new [30], it became essential during the pandemic to triage patients and reduce non-urgent/ non-essential patient attendance, so as to limit in-person visits and minimize transmission of the virus [15]. While it is not a replacement for clinical examination, TD can be utilized to triage cases and successfully identify abnormal oral lesions [31]. Early in the pandemic, only the audioconferencing approach was available, hence TD was limited to telephone consultations [32], imposing challenges for telephone triage. One challenge was that AD consultations could lead to inadvertent misdiagnosis due to the lack of visual clinical assessment and to patients providing inaccurate information about their symptoms [32,33]. The current report explored TD callers' and call characteristics, agreement between AD vs CFTF diagnosis, and whether agreement was related to dentist's experience, caller-patient relationship, time of call and dental condition.
The main finding was that, generally, the Kappa coefficient showed very good and statistically significant pair-wise agreement between AD and CFTF diagnoses. However, some conditions were misdiagnosed. Dental abscess, dry socket, tooth luxation/avulsion, temporomandibular dysfunction, and salivary gland disease were the least misdiagnosed, while the most frequent disagreements were related to the remote AD diagnosis of pulpitis and periodontal disease. No previous reports assessed the reliability of tentative AD diagnosis during the pandemic. Few prepandemic studies evaluated real-time TD, and these focused on videoconferencing rather than AD [34][35][36]. Hence, it was not feasible to directly compare our findings with other studies.
Pulpitis was the most misdiagnosed condition, comprising 48% of all misdiagnoses. This is not entirely surprising and might be explained by two reasons. First, thorough endodontic pain assessment and thermal/ mechanical testing are key diagnostic tools to assess the pulp and periapical tissues for potential endodontic pathology [37]; neither is feasible remotely. Secondly, most emergency patients with moderate/ severe pain would have already taken analgesics to control their pain [38], hence masking the 'real' endodontic diagnosis [39]. Our observed overdiagnosis of pulpitis may have led to unnecessary over-referrals of patients for CFTF evaluation and treatment, with concerns of viral transmission during shortages in personal protective equipment. Likewise, most (80%) conditions diagnosed by AD as periodontal disease disagreed with the definitive CFTF diagnosis. Generally, diagnosis of gingival inflammation is based on bleeding on probing, while periodontitis diagnosis is established on measures of probing depth, attachment level, radiographic pattern and extent of alveolar bone loss [40]. Hence, remote AD evaluation based only on self-reported dental pain and other symptoms is unable to accurately diagnose periodontal disease and could lead to over-referral of callers. The disagreements we observed demonstrate overdiagnosis by TD of conditions for which it is not possible to objectively assess the characteristics of pain or, alternatively, the history, source and amount of bleeding. For conditions not associated with such symptoms and where it was easier for the patient to indicate the experienced dental problem (e.g., soft tissue injury from orthodontic appliance, broken/ loose dental prosthesis/ restoration, and broken/ loose orthodontic appliance), perfect agreement between AD and CFTF diagnosis was observed.
For TD/ AD to be utilized to its full potential, a more comprehensive guidance around its usage is needed [41] and future studies could explore whether further refining of the currently available guidelines or targeted tele-dentist training may improve AD of endodontic and periodontal conditions.
As for objective two, generally, the reliability of AD diagnosis was consistently very good for dentists with different years of experience (κ range 0.843-0.872). The only exception was for those with 5-10 years' experience, where agreement dropped to good (κ = 0.678), probably a function of the very small number of patients that these dentists attended to (n=9, Table 3). Such consistently very good reliability of AD diagnosis across dentists with variable years of experience suggests that employing fresh dental graduates at the first line of AD triage could be safe and also effective in freeing the valuable time of highly specialized consultants for other tasks. Such better resource allocation is feasible and probably cost-effective, particularly where safety of decision-making is enhanced by training of dentists and the use of protocols, algorithms and flow charts [42]. These findings might not be entirely surprising; in other medical disciplines, nurses achieved appropriate referral rates and telephone triage when compared with physicians and GPs on many health-related items [24,43], highlighting the potential that well trained and prepared dental assistants or hygienists could safely and reliably attend to the first line of AD triage during an epidemic. Similarly, others found that non-dental staff guided by triage protocols can successfully filter requests for out-of-hours emergency dental care [44].
Table 2. Extent of agreement between tentative audio-dentistry diagnosis and definitive clinical face-to-face diagnosis across the sample.
Total by diagnosis category (1-15): 3, 36, 11, 4, 20, 9, 23, 13, 51, 8, 1, 2, 4, 3, 2; T = 191.
Weighted kappa = 0.853; P < 0.0001.
*1 = cellulitis; 2 = pulpitis; 3 = pericoronitis; 4 = periodontal disease; 5 = soft tissue injury from orthodontic appliance; 6 = tooth fracture; 7 = broken/ loose dental prosthesis/ restoration; 8 = dental abscess; 9 = broken/ loose orthodontic appliance; 10 = ulcer; 11 = dry socket; 12 = tooth luxation/ avulsion; 13 = cyst; 14 = temporomandibular dysfunction; 15 = salivary gland disease. The number in each cell indicates the number of patients diagnosed initially by AD telemedicine (tentative diagnosis) and CFTF (definitive diagnosis); bolded italic cells indicate the number of patients with disagreement between their tentative AD and definitive CFTF diagnosis; the diagonal represents the number of cases with perfect agreement. T = total number of patients.

Fig. 3. Dental conditions diagnosed by hotline dentists and their frequencies
As for caller-patient relationship, family caregivers are advocates, care coordinators, and often involved in decision making to help recipients obtain needed health care resources [45]. We support such views. The current report observed that, generally, the reliability of AD diagnosis was consistently good to very good whether the call was made by the patient or mediated by a family caregiver (κ range 0.788-1.0). Indeed, reliability was even slightly higher when a parent mediated the call compared to when the patient made the call, in agreement with the view that obtaining information about a child's condition from multiple informants is the ultimate assessment approach as it provides an all-inclusive picture [46]. For instance, in the current report, when the patient's mother called in, there was perfect agreement, supporting other studies where mothers accurately perceived their child's caries experience [47,48]. We also observed very good agreement when the caller was the patient's father, or the patient her/ himself, while agreement was lower when other family members, friends, or caregivers mediated the call, although it was still good. This concurs with other studies that have similarly shown that during the current pandemic, parent-mediated teleassessment within the home environment was feasible for many medical conditions [49,50]. Hence, AD may be valuable in under-resourced and geographically isolated communities, where family members can contribute to remote patient evaluation [23] and improve the AD experience.
This report also found that time of call significantly influenced the agreement of AD vs CFTF diagnosis. Agreement was very good for calls between 7 am-2 pm, and good for calls between 2 pm-10 pm. Other medical disciplines observed a lack of safety after hours [24]. From the patient's side, those contacting after-hours hotline services may have severe or time-critical conditions that could influence the quality of communication and information obtained from the distressed caller; from the provider's side, working after-hour shifts has social and psychophysical implications and could impair performance efficiency [51,52]. Training for after-hour providers, interventions on the organization of shift schedules, and careful health surveillance and social support for shift workers are important preventive and corrective measures [51].
The current report has limitations. The HDC hotline was implemented when most dentists were working from home during the pandemic, hence we are unable to exclude any initial inconsistency or incompleteness of the data recording in the TD data-form. Endodontic and periodontal conditions contributed to a slightly lowered general agreement, hence future prospective studies focusing on specific dental conditions with larger samples may provide further evidence on the reliability of AD. The current report also has many strengths. It assessed the extent of agreement between AD vs CFTF diagnosis for a range of dental diseases, highlighting conditions where agreement was not perfect; and appraised whether agreement was influenced by the dentist's experience, caller-patient relationship, and time of call. We are unaware of previous reports that undertook such a task.

Conclusions
TD and triage were feasible in the home setting during epidemics. TD diagnosis of dental/ oral conditions employing synchronous audioconferencing displayed very good agreement and diagnostic accuracy when compared to CFTF diagnosis. Conditions associated with acute symptoms e.g. pulpitis and periodontitis slightly reduced the reliability of AD, although it still remained high. Dentists achieved very good remote diagnostic performance regardless of their years of experience, although experienced dentists exhibited even higher agreement. Parents as call mediators were more likely to favourably influence the diagnostic reliability of AD. Morning calls displayed higher diagnostic reliability compared to afternoon/ evening calls, although the latter also displayed good agreement. The findings suggest that the use of AD in the home environment is safe and reliable, deploying providers with variable years of experience. The findings have generalizability potential to a variety of similar circumstances, healthcare settings and epi/pandemic situations.

Authors' contributions
Shaymaa Abdulreda Ali is the principal investigator for this manuscript, and has contributed to the conceptualization, data curation, formal analysis, investigation, methodology, project administration, resources, software, supervision, validation, visualization, writing original draft, writing review, and editing. Walid El Ansari is a coinvestigator for investigations reported in this manuscript and contributed to data curation, formal analysis, investigation, methodology, writing review and editing. Both authors read and approved the final manuscript.

Funding
This research did not receive any specific grant from funding agencies in the public, commercial, or not-for-profit sectors. Open access funding is provided by Qatar National Library.

Declaration of Competing Interest
Both authors confirm that there is no conflict of interest and declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.