Abstract

The detection of mesiodens, a type of supernumerary tooth, is crucial for appropriate diagnosis and treatment. This study aimed to develop a convolutional neural network (CNN)-based model to automatically detect mesiodens in cone-beam computed tomography images. A dataset of 851 anonymized axial slices from the cone-beam images of 106 patients was used to train the artificial intelligence system for the detection and segmentation of mesiodens. The CNN model achieved high performance in mesiodens segmentation, with sensitivity, precision, and F1 scores of 1, 0.9072, and 0.9513, respectively. The area under the curve (AUC) was 0.9147, indicating the model's robustness. The proposed model shows promising potential for the automated detection of mesiodens, providing valuable assistance to dentists in accurate diagnosis.

1. Introduction

Supernumerary teeth occur in both the primary and permanent dentitions, with a higher prevalence in the permanent dentition [1]. Genetic and environmental variables are considered to play a role in the etiology of supernumerary teeth; however, the precise cause is still unknown. Moreover, supernumerary teeth are thought to be linked to hereditary disorders such as cleft lip and palate, cleidocranial dysplasia, and Gardner syndrome [1]. Mesiodens, which are found in the premaxillary area, are the most prevalent supernumerary teeth [2]. A mesiodens may occur singly or in multiples [1, 3]. Mesiodens are divided into four categories based on their morphology: conical, molariform, supplemental, and tuberculate [3, 4]. A mesiodens can cause root resorption, rotation, or displacement of the adjacent teeth; it can also cause delayed permanent tooth eruption, midline diastema, nasal floor perforation, and other complications that may require surgical or orthodontic treatment [1, 3-5].

Mesiodens may be detected because of a patient's complaint or during a routine radiographic examination. Panoramic radiographs, which have a wide imaging field showing the maxillary and mandibular dental arches together, are employed as an auxiliary diagnostic tool in clinical examination [6]. However, superimpositions, particularly in the midline region where mesiodens occur, may hinder their identification. Furthermore, due to the two-dimensional nature of this imaging technique, unequal magnification, and geometric distortions, the relationship of a mesiodens with adjacent teeth and anatomical structures may be misinterpreted [6]. Therefore, three-dimensional imaging methods must be used to precisely examine its morphology, location, and effect on the surrounding tissues. Cone-beam computed tomography (CBCT) has been established as an effective diagnostic method for the detection and detailed evaluation of mesiodens. CBCT is a useful imaging technique for analyzing three-dimensional images of mesiodens in the coronal, sagittal, and axial planes and for planning the course of treatment [7].

Dentists must be trained to interpret CBCT images appropriately, which is not always possible. Many details may be overlooked when interpreting CBCT images because of the clinician's limited radiological competence, time restrictions, or lack of experience. Therefore, computer-aided systems (CADs) have been developed to aid radiographic evaluation. CADs have been used to identify anatomical structures and some pathologies in the maxillofacial region [8]. Artificial intelligence systems based on convolutional neural networks (CNNs) and deep learning (DL) have gained popularity thanks to advances in object detection, image classification, and segmentation [9]. These methods allow appropriate image features to be learned automatically for a variety of tasks, and deep learning-based AI systems are being developed to fully automate the detection of anomalies and anatomical structures [10].

The objective of this study is to use a CNN-based model to detect mesiodens in axial sections of CBCT images and to evaluate the performance of this diagnostic method. The remainder of the manuscript is organized as follows: Section 2 presents the data sources, study design, labeling process, model training, and model performance metrics. Section 3 presents the outcomes of the model training. Section 4 compares this study with existing approaches in the literature and explores its limitations. Finally, Section 5 concludes by suggesting potential future applications of the model.

2. Material and Methods

2.1. Study Design and Data Sources

The present study used cone-beam computed tomography scans obtained from the radiography archive of the Inonu University Department of Dentomaxillofacial Radiology. This archive comprises patients with permanent or mixed dentition who had sought dental care for various conditions, including impacted supernumerary teeth, temporomandibular joint disorders, and pathological anomalies such as cysts or tumors. Scans showing at least one mesiodens and no missing incisors or canines were included in the study.

The archive was screened for the diagnosis of mesiodens, and images in which one or more mesiodens were detected in patients with no missing incisors or canines in the maxillary anterior region were included. In total, the CBCT images of 106 patients were used, without regard to gender. The NewTom 5G CBCT device (Verona, Italy) was used to acquire the images with the following parameters: 110 kVp, 1-11 mA, 3.6 s exposure, and 12 × 8 or 15 × 12 cm fields of view (FOV). All images were recorded with an axial slice thickness of 0.2 mm and isotropic voxels. The study protocol was approved by the Inonu University Noninterventional Clinical Research Ethics Committee (decision date and number: 2022/3424), and the study was conducted according to the principles of the Declaration of Helsinki.

CBCT images were saved as Digital Imaging and Communications in Medicine (DICOM) files. The DICOM files were converted to axial frame images in JPEG (Joint Photographic Experts Group) format using the open-source ITK-SNAP software, version 3.8 (https://www.itksnap.org). Images containing artifacts that prevented mesiodens segmentation (metal artifacts or artifacts caused by positioning errors during acquisition) were excluded. The resulting 851 axial images were used in the study.
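As an illustration only, the sketch below shows how such a DICOM-to-JPEG conversion could be scripted; the study itself used the ITK-SNAP interface, so the pydicom/Pillow workflow and filenames here are assumptions, not the authors' procedure.

```python
# Illustrative programmatic equivalent of the DICOM-to-JPEG export
# (the study used the ITK-SNAP GUI; pydicom/Pillow are assumptions).
import numpy as np
import pydicom
from PIL import Image

def dicom_slice_to_jpeg(dicom_path: str, jpeg_path: str) -> None:
    """Read one axial DICOM slice and save it as an 8-bit JPEG."""
    ds = pydicom.dcmread(dicom_path)
    pixels = ds.pixel_array.astype(np.float32)
    # Scale the raw intensities into the full 8-bit range.
    pixels -= pixels.min()
    pixels /= max(pixels.max(), 1e-6)
    Image.fromarray((pixels * 255).astype(np.uint8)).save(jpeg_path, "JPEG")

dicom_slice_to_jpeg("slice_0001.dcm", "slice_0001.jpg")  # hypothetical filenames
```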

2.2. Ground Truth

The project was created by uploading the 851 axial images to CranioCatch labeling software (Eskişehir, Turkey). Two dentomaxillofacial radiologists (S.B.D., with 10 years of experience, and D.C.O., with 2 years of experience) then labeled the mesiodens (completely formed teeth) on the axial slices using the polygonal box method, reaching consensus on each label. Each mesiodens was labeled separately in images containing multiple mesiodens. Labeling was carried out on every slice from the first section in which the mesiodens crown appeared to the last section in which the root was observed.
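For readers unfamiliar with polygon labels, the hypothetical record below shows what a single mesiodens annotation might look like in a generic COCO-style format; the actual CranioCatch export format is not specified in the study, so all field names and values here are assumptions.

```python
# Hypothetical example of one polygonal mesiodens label in a COCO-style
# record; the real CranioCatch export format may differ.
annotation = {
    "image_id": 42,                      # axial slice the label belongs to
    "category": "mesiodens",
    "segmentation": [[210.0, 198.5,      # x1, y1
                      236.0, 201.0,      # x2, y2
                      241.5, 228.0,      # x3, y3
                      215.0, 231.5]],    # closed polygon around the tooth
    "iscrowd": 0,
}
```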

2.3. Deep-Learning Architecture and Model Development
2.3.1. Preprocessing Steps

The 851 anonymized axial CBCT slices were resized to 512 × 512 pixels. To improve image quality, enhancement techniques such as intensity normalization and contrast-limited adaptive histogram equalization (CLAHE) were applied. The dataset used for model development consisted of 710 images containing a total of 844 mesiodens labels. Images were randomly assigned to training, validation, and test groups in an 80%/10%/10% split (Figure 1). The training set, the data from which the model learns and which contains both the input images and their corresponding labels, comprised 569 images and 679 labels; the validation set comprised 71 images and 81 labels; and the test set comprised 70 images and 84 labels.
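As a rough illustration of these preprocessing steps, the following Python sketch applies the resize, intensity normalization, and CLAHE operations described above, followed by the 80/10/10 split; the OpenCV parameters, filenames, and split helper are our assumptions, not the authors' exact pipeline.

```python
# Minimal preprocessing sketch matching the steps described above;
# parameter values are illustrative assumptions.
import cv2
import numpy as np
from sklearn.model_selection import train_test_split

def preprocess(path: str) -> np.ndarray:
    img = cv2.imread(path, cv2.IMREAD_GRAYSCALE)
    img = cv2.resize(img, (512, 512), interpolation=cv2.INTER_CUBIC)
    img = cv2.normalize(img, None, 0, 255, cv2.NORM_MINMAX)  # intensity normalization
    clahe = cv2.createCLAHE(clipLimit=2.0, tileGridSize=(8, 8))
    return clahe.apply(img)

# 80% training, 10% validation, 10% test, as in the study.
paths = [f"axial_{i:04d}.jpg" for i in range(710)]  # hypothetical filenames
train, rest = train_test_split(paths, test_size=0.2, random_state=42)
val, test = train_test_split(rest, test_size=0.5, random_state=42)
```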

2.3.2. Convolutional Neural Network (CNN) Model and Training

Deep learning was carried out with transfer learning techniques supported by the TensorFlow 1 library, using the open-source Python programming language (v.3.6.1; Python Software Foundation, Wilmington, DE, USA). A mask R-CNN model with a ResNet-101 backbone, implemented in TensorFlow 1, was trained for 300 epochs with a learning rate of 0.001 to create mesiodens segmentation models on axial slices of CBCT images (Figure 2). Training was performed on the computer equipment of the Eskisehir Osmangazi University Faculty of Dentistry Dental-AI Laboratory, comprising a Dell PowerEdge T640 Calculation Server, a Dell PowerEdge T640 GPU Calculation Server, and a Dell PowerEdge R540 Storage Server (all Dell Inc., Texas, USA). Details of the equipment specifications are provided in the appendix.
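The paper does not name the specific mask R-CNN implementation used. As one plausible TensorFlow 1-era option, the sketch below shows how the reported configuration (ResNet-101 backbone, 300 epochs, learning rate 0.001) could be expressed with the open-source Matterport mrcnn package; its use here is an assumption, and everything beyond the reported hyperparameters is illustrative.

```python
# Sketch of a training configuration using the Matterport Mask R-CNN
# implementation (a common TensorFlow 1 option, assumed here).
from mrcnn.config import Config
from mrcnn import model as modellib

class MesiodensConfig(Config):
    NAME = "mesiodens"
    BACKBONE = "resnet101"
    NUM_CLASSES = 1 + 1          # background + mesiodens
    IMAGE_MIN_DIM = 512
    IMAGE_MAX_DIM = 512
    LEARNING_RATE = 0.001        # as reported in the study

config = MesiodensConfig()
model = modellib.MaskRCNN(mode="training", config=config, model_dir="./logs")

# Placeholders: in practice these would be mrcnn.utils.Dataset subclasses
# wrapping the labeled axial slices and their polygon masks.
dataset_train = dataset_val = ...  # hypothetical, omitted for brevity

model.train(dataset_train, dataset_val,
            learning_rate=config.LEARNING_RATE,
            epochs=300, layers="all")
```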

2.4. Performance Evaluation of AI Model and Model Performance Metrics

A confusion matrix was used to evaluate model performance. Together with the confusion matrix, precision-recall (PR) and receiver operating characteristic (ROC) curves were generated, and the area under the curve (AUC) was calculated.
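As a brief illustration, ROC and PR curves and the AUC can be derived from per-detection ground-truth labels and model confidence scores, for example with scikit-learn; the library choice and the values below are assumptions for demonstration only, not the authors' evaluation pipeline.

```python
# Hedged sketch of deriving ROC/PR curves and AUC with scikit-learn.
from sklearn.metrics import auc, precision_recall_curve, roc_curve

# y_true: 1 for a ground-truth mesiodens, 0 otherwise; y_score: model
# confidence for each candidate region (hypothetical values).
y_true = [1, 1, 0, 1, 0, 1]
y_score = [0.98, 0.91, 0.62, 0.88, 0.45, 0.95]

fpr, tpr, _ = roc_curve(y_true, y_score)
roc_auc = auc(fpr, tpr)                          # area under the ROC curve
prec, rec, _ = precision_recall_curve(y_true, y_score)
```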

The metrics used to evaluate the performance of the mesiodens segmentation model were defined as follows: true positive (TP), a mesiodens that was accurately detected and segmented; false positive (FP), a region that was wrongly segmented as a mesiodens; and false negative (FN), a mesiodens that was not segmented.

The performance metrics of the model were calculated from the TP, FP, and FN counts according to the following formulas: sensitivity (true positive rate, TPR) = TP/(TP + FN); precision (positive predictive value, PPV) = TP/(TP + FP); and F1 score = 2TP/(2TP + FP + FN).
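These formulas translate directly into code. The short sketch below is our illustration, and it also confirms that the counts reported in Section 3 (TP = 88, FP = 9, FN = 0) reproduce the reported metric values.

```python
# Direct implementation of the reported formulas.
def segmentation_metrics(tp: int, fp: int, fn: int) -> dict:
    sensitivity = tp / (tp + fn)             # TPR
    precision = tp / (tp + fp)               # PPV
    f1 = 2 * tp / (2 * tp + fp + fn)
    return {"sensitivity": sensitivity, "precision": precision, "f1": f1}

print(segmentation_metrics(88, 9, 0))
# {'sensitivity': 1.0, 'precision': 0.9072..., 'f1': 0.9513...}
```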

3. Results

The AI model based on the developed deep learning architecture yielded successful results for mesiodens segmentation in CBCT images. The TP, FP, and FN counts for mesiodens segmentation were 88, 9, and 0, respectively. The sensitivity, precision, and F1 score calculated from these values were 1, 0.9072, and 0.9513 (Table 1). In the ROC graph of the model, the TPR was close to 1 (close to the upper left corner); the PR curve (PRC) was close to the upper right corner; and the AUC was 0.9147 (Figures 3 and 4).

4. Discussion

Our CNN-based AI model achieved TP, FP, and FN counts of 88, 9, and 0 for mesiodens segmentation, suggesting that a mask R-CNN-based model can be successfully used for the automated detection of mesiodens on CBCT images. CNNs, which are made up of a series of machine learning algorithms, have numerous processing layers and can learn high-level information from low-level features thanks to the automatic flow of data between them [11]. These multilayer convolutional neural networks exploit local connections, pooling, and shared weights and are therefore often used in imaging diagnostics [12]. Thanks to the advent of deep learning algorithms, artificial intelligence-based studies have become active in several medical sectors in recent years. Although the number of studies in the literature using artificial intelligence systems in dentistry is growing by the day, the use of deep learning in dentistry is still in its early stages.

In the field of dental radiology, artificial intelligence-based studies have mostly focused on tooth detection, identification, and diagnosis [9, 10, 12, 13]. Object detection is the process of determining the area an object occupies in an image, together with its region and boundaries [14]. Many methods, such as R-CNN, fast R-CNN, faster R-CNN, and mask R-CNN, are used for object recognition and detection [15]. Cui et al. [16] proposed the mask R-CNN-based ToothNet for tooth segmentation and classification in CBCT. Jang et al. [17], in a study employing 3D mask R-CNN to perform 3D tooth segmentation, reported an accuracy of 93.35% for classifying tooth types and 95.97% for tooth segmentation. Miki et al. [18] performed an automatic tooth classification study on CBCT images using a DCNN-based artificial intelligence system with the AlexNet architecture. They augmented the training data by rotating the images and adjusting their intensities, achieving an average classification accuracy of 88.8%. These algorithms have been employed in research on orthodontic diagnosis, root canal treatment, tooth extraction, and lesion diagnosis [19-21].

In this study, CBCT images were used to create models that can automatically detect mesiodens with a deep learning system. There are studies evaluating the three-dimensional detection of mesiodens using CBCT and their relationship with the surrounding tissues; however, to the best of our knowledge, no studies have examined the performance of deep learning models in evaluating mesiodens on CBCT images. Göksel et al. [22] found a mesiodens prevalence of 5.04% in a survey of 5000 CBCT images; of these cases, 76.2% had one mesiodens, 18.8% had two, and 4.9% had three. Ryu et al. [23] used CBCT to analyze the position, localization, and association with surrounding tissues of 602 mesiodens in 452 participants. In a similar study of 81 mesiodens on 62 CBCT images evaluating position, eruption direction, and formation, diastema was observed in 19.4% of the surrounding teeth, displacement in 17.7%, and delayed eruption or impaction in 12.9%; rotation occurred in 1.7% and dentigerous cysts in 4.8% of cases, while no change was observed in 43.5% [24].

The radiological and clinical importance of mesiodens makes automating their detection worthwhile. Kuwada et al. [25] developed a method for detecting impacted supernumerary teeth (IST) in the maxillary anterior region. Using 550 panoramic radiographs, 275 of which included IST, detection models were created with the AlexNet, VGG-16, and DetectNet architectures. While the AlexNet and VGG-16 architectures perform only object classification, the DetectNet architecture performs both object classification and object detection. DetectNet showed the highest diagnostic performance, with AUC values of 0.93 and 0.96 on two different test sets. However, their study was limited to permanent dentition. In general, it is much more difficult to detect an impacted mesiodens in mixed dentition than in permanent dentition. In the present study, mesiodens were detected in both mixed and permanent dentition, and the AUC value of the model was 0.91.

In another study, three CNN models (AlexNet, VGG16-TL, and InceptionV3-TL) were developed for the detection of supernumerary teeth in the premaxillary region on panoramic radiographs of individuals with mixed dentition only. The AlexNet, VGG16-TL, and InceptionV3-TL models reached high sensitivity values of 82.5%, 85.0%, and 83.3%, respectively [26]. The authors also stated that sensitivity is the most important criterion for avoiding false-negative diagnoses. The model used in the present study achieved a sensitivity of 1 (100%).

Kim et al. [27] developed a two-network model for automatic mesiodens detection on panoramic radiographs. The first network (DeepLabV3plus) is a segmentation model that uses the posterior molar space to set the region of interest (ROI) in the maxillary anterior region, and the second network (Inception-ResNet-v2) is a classification model that uses the cropped maxillary anterior region to identify mesiodens [27]. The average accuracy, precision, recall, F1 score, and AUC of the automatically segmented model were all 0.971. The deep learning model used in our study was single stage, with no separate regions defined on the CBCT axial slices. Despite this, the sensitivity, precision, F1 score, and AUC values of the model were 1, 0.907, 0.951, and 0.914, respectively.

Ahn et al. [28], whose study included primary and mixed dentition groups, created deep learning models based on the SqueezeNet, ResNet-18, ResNet-101, and Inception-ResNet-V2 architectures for the detection of mesiodens on 1100 panoramic radiographs, 550 of which contained mesiodens and 550 of which served as controls. ResNet-101 and Inception-ResNet-V2 were reported to exhibit accuracy, precision, recall, and F1 scores above 90%, with SqueezeNet comparatively lower. ResNet-101 was likewise used as the transfer learning backbone in the present study. This architecture consists of residual blocks whose skip connections carry the output of earlier layers forward, so effective features are preserved even when an individual layer contributes little [29].
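To illustrate the residual mechanism described above, the following minimal Keras sketch (our illustration, not the study's code) shows a basic residual block in which a skip connection adds the block input to its convolved output.

```python
# Minimal sketch of a ResNet-style residual block: the shortcut adds the
# block input to its output, preserving information from earlier layers.
import tensorflow as tf
from tensorflow.keras import layers

def residual_block(x: tf.Tensor, filters: int) -> tf.Tensor:
    # Assumes x already has `filters` channels so the addition is valid.
    shortcut = x
    y = layers.Conv2D(filters, 3, padding="same", activation="relu")(x)
    y = layers.Conv2D(filters, 3, padding="same")(y)
    y = layers.Add()([shortcut, y])          # skip connection
    return layers.Activation("relu")(y)

inputs = tf.keras.Input(shape=(512, 512, 64))
outputs = residual_block(inputs, 64)         # example usage
```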

Jeon et al. [30] found the accuracy values of the YOLOv3, RetinaNet, and EfficientDet-D3 models to be 97.5%, 98.3%, and 99.2%, respectively. Theirs was the first study of mesiodens detection on periapical radiographs, and EfficientDet-D3, used for the first time on dental radiographs, was reported to have the best accuracy among the three AI models.

A recent study proposed a fully automatic, one-step algorithm based on a deep mesiodens localization network (DMLnet) to simultaneously identify and localize mesiodens in panoramic images [31]. That model requires no manual marking during the testing phase, which distinguishes it from other mesiodens detection methods. In the present study, labeling was performed on the images for model training, but this is not expected to have affected model performance. As the first study to detect mesiodens on CBCT images, the present work can be considered a step toward future fully automated systems.

Among artificial intelligence studies on the diagnosis of mesiodens, Ha et al. [32] developed a YOLOv3-based CNN model using internal and external multicenter data covering all dentition groups; the model's accuracy was 96.2% on the internal test dataset and 89.8% on the external test dataset. In YOLO, the image is divided into potential bounding boxes from which convolutional features are extracted. The name "You Only Look Once" derives from its one-stage network architecture, which estimates class probabilities and bounding boxes directly, without the separate region-proposal stage used in R-CNN architectures [33]. Mask R-CNN, used in this study, is based on faster R-CNN and is among the most advanced regional object detection algorithms. It differs from other methods in that it identifies all the pixels belonging to each object and creates a high-quality segmentation mask for each instance [34]. By using CBCT images, the current study adds a new perspective to the literature, which so far has addressed the detection of mesiodens with AI models on panoramic radiographs.

Our study has several limitations. The data included in the study were obtained from a single device; more effective artificial intelligence models could be developed by incorporating images obtained from different devices and under different imaging protocols. Furthermore, this study examined only supernumerary teeth in the maxillary anterior region; supernumerary teeth in other parts of the oral cavity were not evaluated. The developed artificial intelligence-based approach solely detects mesiodens and does not categorize them by morphology. Future studies could expand the model to classify mesiodens by morphology and to assess their impact on surrounding teeth and anatomical structures. We believe that our study will contribute to new artificial intelligence models that, in addition to detecting mesiodens, classify them by shape and identify the changes they induce in the surrounding teeth and anatomical regions.

5. Conclusion

Our study successfully developed a CNN-based model for the automated detection of mesiodens in CBCT images. The model demonstrated high performance in mesiodens segmentation, achieving accurate detection and segmentation. The use of deep learning techniques, specifically the mask R-CNN architecture, allowed for the precise identification of mesiodens, thus potentially leading to timely intervention and appropriate treatment planning.

Appendix

The Eskisehir Osmangazi University Faculty of Dentistry Dental-Artificial Intelligence (AI) Laboratory has advanced technology computer equipment, including a Dell PowerEdge T640 Calculation Server (Intel Xeon Gold 5218 2.3G, 16C/32T, 10.4GT/s, 22M Cache, Turbo, HT (125 W) DDR4-2666, 32 GB RDIMM, 3200MT/s, Dual Rank, PERC H330+ RAID Controller, 480 GB SSD SATA Read Intensive 6 Gbps 512 2.5 in Hot-plug AG Drive), a PowerEdge T640 GPU Calculation Server (Intel Xeon Gold 5218 2.3G, 16C/32T, 10.4GT/s, 22M Cache, Turbo, HT (125 W) DDR4-2666 2, 32 GB RDIMM, 3200MT/s, Dual Rank, PERC H330+ RAID Controller, 480 GB SSD SATA Read Intensive 6 Gbps 512 2.5in Hot-plug AG Drive, NVIDIA Tesla V100 16G Passive GPU), a PowerEdge R540 Storage Server (Intel Xeon Silver 4208 2.1G, 8C/16T, 9.6GT/s, 11M Cache, Turbo, HT (85 W) DDR4-2400, 16 GB RDIMM, 3200MT/s, Dual Rank, PERC H730P + RAID Controller, 2Gb NV Cache, Adapter, Low Profile, 8 TB 7.2K RPM SATA 6 Gbps 512e 3.5in Hot-plug Hard Drive, 240 GB SSD SATA Mixed Use 6 Gbps 512e 2.5in Hot plug, 3.5in HYB CARR S4610 Drive), a Precision 3640 Tower CTO BASE workstation (Intel(R) Xeon(R) W-1250P (6 Core, 12M cache, base 4.1 GHz, up to 4.8 GHz) DDR4-2666, 64 GB DDR4 (4 × 16 GB) 2666 MHz UDIMM ECC Memory, 256 GB SSD SATA, Nvidia Quadro P620, 2 GB), and a Dell EMC Network Switch (N1148T-ON, L2, 48 ports RJ45 1GbE, 4 ports SFP+ 10GbE, Stacking).

Abbreviations

AI: Artificial intelligence
AUC: Area under the curve
CAD: Computer-aided systems
CBCT: Cone-beam computed tomography
CLAHE: Contrast-limited adaptive histogram equalization
CNN: Convolutional neural network
DICOM: Digital Imaging and Communications in Medicine
DL: Deep learning
FN: False negative
FOV: Field of view
FP: False positive
FPR: False positive rate
IST: Impacted supernumerary teeth
PPV: Positive predictive value
PRC: Precision-recall curve
ROC: Receiver operating characteristic
TP: True positive
TPR: True positive rate.

Data Availability

The data supporting the current study are available from the corresponding author upon request.

Ethical Approval

All procedures performed in studies involving human participants followed the ethical standards of the institutional and/or national research committee and with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards. The study protocol was approved by Inonu University Noninterventional Clinical Research Ethics Committee (decision date and number: 2022/3424).

Additional informed consent was obtained from all individual participants included in the study.

Conflicts of Interest

The authors declare that they have no conflicts of interest.

Authors’ Contributions

D.C.O., S.B.D., and S.D. performed data collection; A.Z.S., I.S.B., and S.B.D. designed the study; O.C. developed the AI algorithm and evaluated the results; A.A. and D.C.O. revised the manuscript critically for important intellectual content; and A.Z.S. and K.O. gave final approval of the manuscript and performed English proofreading and corrections.

Acknowledgments

This work has been supported by Eskisehir Osmangazi University Scientific Research Projects Coordination Unit under Grant no. 202045E06.