Sensitivity and Specificity of Cephalometric Measures for the Diagnosis of Sagittal Skeletal Malocclusion

Objective: To evaluate and compare the sensitivity and specificity of ANB, Wits, APDI and AF-BF for diagnosing sagittal skeletal malocclusions in children between 6 and 12 years old, using ROC curves, a widely accepted method for the analysis and evaluation of diagnostic tests. Material and Methods: A descriptive-comparative study of diagnostic tests was conducted. From a population of 3,000 children, a non-probabilistic sample of 209 was selected. The clinical classification of the patients as class I, II or III, made by a group of experts based on the visual inspection of models and photographs, was chosen as the gold standard. After calibration (ICC>0.94), the variables were measured on cephalograms. Eight ROC curves were plotted (I vs II, and I vs III, for each of the variables). The areas under the curve were measured and compared (Chi-square test). Cut points were established. Results: To discriminate class I from II, ANB showed the largest area under the curve (AUC) (0.876), and the cut point (best sensitivity and specificity) was at 5.75°. To discriminate class I from III, Wits showed the largest AUC (0.874), with a cut point of -3.25 mm. There were no statistically significant differences between the AUCs of the four variables (p=0.48 and p=0.38 for class I-II and I-III, respectively). Conclusion: ANB and Wits performed better for the diagnosis of class II and III, respectively. Cut points in children were different from those reported in adults.


Introduction
An accurate diagnosis and an effective, easy and precise classification of skeletal malocclusions are essential in orthodontics and maxillary orthopedics [1][2][3]. Many diagnostic tools have been used; among them, the cephalogram stands out, as it has proven to be a valuable technique for the evaluation of maxillomandibular sagittal discrepancy [4][5][6][7].
Several authors have proposed angular and linear measures to classify the skeletal sagittal relationship. The first was the ANB angle (A point-Nasion-B point). This angle is reported to vary with several factors, such as the length and inclination of the Sella-Nasion plane, so a diagnosis based on it could be misleading [8][9][10][11][12][13]. Subsequently, another author suggested the Wits appraisal, which uses the occlusal plane as the reference plane [14,15]. It has been shown that this measure varies widely with changes in the inclination of the occlusal plane [11,16]. To counteract this deficiency, another study introduced the AF-BF measure, the distance between the perpendiculars drawn from point A and point B to the Frankfort horizontal plane [6]. Other authors proposed the anteroposterior dysplasia indicator (APDI), obtained from the facial angle plus/minus the A-B plane angle and, again, plus/minus the palatal plane angle [17].
Currently, the contradictory findings on the correlation of the different measures used to determine the intermaxillary sagittal relationship make it difficult to recommend one over another; moreover, information about the sensitivity and specificity of their values is scarce.
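The AF-BF definition above is a projection onto a reference line, which can be sketched in a few lines. This is only an illustration under stated assumptions: landmarks are 2D coordinates in millimetres, the Frankfort plane is taken as the line through porion and orbitale, and the value is taken as positive when point A projects ahead of point B. The function name `af_bf` and all coordinates are hypothetical and do not come from the study.

```python
import math

def af_bf(a, b, porion, orbitale):
    """Signed distance (mm) between the projections of A and B on the Frankfort line.

    Assumption: the anterior direction runs from porion to orbitale, so the
    result is positive when point A projects ahead of point B.
    """
    # Unit vector along the Frankfort horizontal, pointing anteriorly.
    ux, uy = orbitale[0] - porion[0], orbitale[1] - porion[1]
    norm = math.hypot(ux, uy)
    ux, uy = ux / norm, uy / norm
    # The distance between the two perpendicular feet equals the component
    # of the vector from B to A along the Frankfort line.
    return (a[0] - b[0]) * ux + (a[1] - b[1]) * uy

# Hypothetical landmark coordinates in mm (x anterior, y upward).
distance = af_bf(a=(70.0, -30.0), b=(64.0, -70.0),
                 porion=(0.0, 0.0), orbitale=(80.0, 0.0))
print(distance)
```

Because only the component along the Frankfort line is used, the vertical separation of points A and B does not affect the result.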
Receiver operating characteristic (ROC) curves are a widely accepted method for the analysis and evaluation of a diagnostic test [18][19][20][21]. They show the relationship between the sensitivity of a test (the proportion of people with a disease who have a positive result) and its specificity (the proportion of people without the disease who have a negative result). Such curves can be used not only to decide the optimal cut point, which can be located on the plotted curve, but also to compare alternative tests for the same diagnosis [19].
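As a minimal sketch of these definitions, the following computes sensitivity and specificity at a given cut point and collects one ROC point per candidate cut. The rule that a value at or above the cut is called positive, the function name `sens_spec_at`, and the measurements themselves are assumptions made for illustration; they are not data from this study.

```python
def sens_spec_at(values, labels, cut):
    """Sensitivity and specificity when 'value >= cut' is called positive.

    labels: 1 = condition present according to the gold standard, 0 = absent.
    """
    tp = sum(1 for v, y in zip(values, labels) if y == 1 and v >= cut)
    fn = sum(1 for v, y in zip(values, labels) if y == 1 and v < cut)
    tn = sum(1 for v, y in zip(values, labels) if y == 0 and v < cut)
    fp = sum(1 for v, y in zip(values, labels) if y == 0 and v >= cut)
    return tp / (tp + fn), tn / (tn + fp)

# Hypothetical angular measurements in degrees with their gold-standard labels.
values = [2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0, 9.0]
labels = [0, 0, 0, 1, 0, 1, 1, 1]

# Each candidate cut yields one ROC point: (1 - specificity, sensitivity).
roc_points = []
for cut in sorted(set(values)):
    sens, spec = sens_spec_at(values, labels, cut)
    roc_points.append((cut, 1 - spec, sens))

for cut, fpr, sens in roc_points:
    print(cut, fpr, sens)
```

Raising the cut point trades sensitivity for specificity, which is why the whole curve, rather than a single threshold, is used to compare tests.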
Taking into account the limited information available in scientific papers regarding the evaluation of the accuracy of the different cephalometric measures for the sagittal diagnosis of malocclusions [22][23][24], and considering that the published results are contradictory, this study evaluates and compares the sensitivity and specificity of cephalometric measures ANB, Wits, APDI and AF-BF for the sagittal diagnosis of class I, II and III malocclusions in children 6-12 years old.

Study Design
A descriptive-comparative study of diagnostic tests was conducted in a population of 3,000 children aged 6-12 years who had been referred for maxillary orthopedic treatment to the Center for Human Growth and Development of the Faculty of Dentistry, University of Antioquia, where the study was carried out. Data gathering, processing and analysis were performed between July 2015 and February

Sampling
A convenience sample of 660 children was selected. The inclusion criteria were complete, good-quality diagnostic records (lateral head films, cast models and clinical photographs) and informed consent signed by the parents. Patients with visible facial asymmetry in extraoral photographs, early primary molar loss, posterior mesial drift or vertical skeletal alterations, as judged by the expert panel, were excluded. All diagnostic records were taken at the same center by the same operator and with the same X-ray equipment, Veraviewepocs 2D (J. Morita USA Inc., Irvine, California, USA).

Data Collection
Clinical evaluation was based on a previously described method that has shown a statistically significant correlation with the skeletal sagittal maxillomandibular relationship [22][23][24][25][26].
The evaluation consisted of a visual inspection of cast models and of frontal and lateral photographs by a group of four expert orthodontists/orthopedists. Occlusal variables such as molar and canine relationships and overjet were evaluated, as well as the facial profile. Each expert independently rated each patient's malocclusion twice, a month apart. Individuals whose sagittal classification coincided in these two evaluations were included and categorized as malocclusion class I, II or III. The gold standard was defined in this way. The final sample included only children with 100% agreement among the experts' classifications (to avoid discordance bias in the application of the gold standard). Only these individuals' cephalograms were traced.
Two hundred and nine radiographs were traced by a previously trained and calibrated operator. Cephalometric tracing was performed with Vistadent 2.1 Software (Dentsply Sirona, Ontario, Canada). Each image was standardized 1:1 using the software and the millimetric ruler located on the nasion of the X-ray machine. This ruler was visible in all radiographs. The four variables (ANB, Wits, APDI and AF-BF) were measured on lateral head films according to the original authors' descriptions [6,8,14,15,17].

Statistical Analysis
All data were analyzed using the statistical package IBM SPSS Statistics Software, version 23 (IBM Corp., Armonk, NY, USA) and Epidat, version 3.1 (OPS/OMS, Xunta de Galicia, Spain).
The intraclass correlation coefficient (ICC) was used to determine the reproducibility of the cephalometric tracing landmarks. To categorize malocclusions according to the ANB, Wits, AF-BF and APDI measures, the mean and standard deviation with a 95% confidence interval were used. The Kolmogorov-Smirnov test was used to evaluate the distribution of the variables, and parametric tests were used.
To estimate the diagnostic performance of the linear and angular measures, 8 ROC curves were plotted (discrimination between class I/II and between class I/III, for each of the four cephalometric variables). With these curves, cut points were established to calculate the sensitivity and specificity of each measure for each malocclusion. To compare the AUCs of the variables, a Chi-square test of homogeneity was done.
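The two quantities read from each ROC curve, the AUC and a cut point, can be sketched as follows. The text does not state which criterion defined the point of best sensitivity and specificity, so the Youden index (J = sensitivity + specificity - 1) used here is an assumption, as are the function names and the data. The Chi-square comparison of AUCs is not reproduced.

```python
def auc_rank(values, labels):
    """AUC as the probability that a positive case has a higher value than a
    negative case, with ties counted as one half (rank-based estimate)."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0
               for p in pos for n in neg)
    return wins / (len(pos) * len(neg))

def youden_cut(values, labels):
    """Cut point maximizing J = sensitivity + specificity - 1,
    for the rule 'value >= cut is positive'."""
    pos = [v for v, y in zip(values, labels) if y == 1]
    neg = [v for v, y in zip(values, labels) if y == 0]
    best_cut, best_j = None, -1.0
    for cut in sorted(set(values)):
        sens = sum(v >= cut for v in pos) / len(pos)
        spec = sum(v < cut for v in neg) / len(neg)
        j = sens + spec - 1
        if j > best_j:  # keep the first cut reaching the maximum
            best_cut, best_j = cut, j
    return best_cut, best_j

# Hypothetical measurements: 5 reference cases (0) and 4 affected cases (1).
values = [1.0, 2.0, 3.0, 4.0, 6.0, 5.0, 7.0, 8.0, 9.0]
labels = [0, 0, 0, 0, 0, 1, 1, 1, 1]

auc = auc_rank(values, labels)
cut, j = youden_cut(values, labels)
print(auc, cut, j)
```

For a measure in which lower values indicate the condition, as with negative Wits values in class III, the same functions can be applied to the negated measurements.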
A supplementary Woolf test was done to determine the Diagnostic Odds Ratio (DOR) with a 95% confidence interval [27,28].
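For reference, the DOR and its Woolf (logit) interval follow directly from the four cells of a 2x2 table: the DOR is (TP x TN) / (FP x FN), and the standard error of its logarithm is the square root of the sum of the reciprocals of the four counts. The sketch below uses hypothetical counts, not data from this study, and the function name is illustrative.

```python
import math

def diagnostic_odds_ratio(tp, fp, fn, tn, z=1.96):
    """Return (DOR, lower, upper) using Woolf's log-based confidence interval.

    z = 1.96 gives an approximate 95% interval. All four counts must be
    non-zero; no continuity correction is applied in this sketch.
    """
    dor = (tp * tn) / (fp * fn)
    se_log = math.sqrt(1 / tp + 1 / fp + 1 / fn + 1 / tn)
    lower = math.exp(math.log(dor) - z * se_log)
    upper = math.exp(math.log(dor) + z * se_log)
    return dor, lower, upper

# Hypothetical 2x2 table: 40 true positives, 5 false positives,
# 10 false negatives, 45 true negatives.
dor, lower, upper = diagnostic_odds_ratio(tp=40, fp=5, fn=10, tn=45)
print(dor, lower, upper)
```

An interval that excludes 1 indicates that the test discriminates better than chance; the interval is symmetric on the log scale, not on the original scale.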

Ethical Aspects
The study was approved by the Ethics Committee of the Faculty of Dentistry, University of Antioquia (Minute number 1,2015). It was classified as minimal risk and complied with all national and international regulations for studies in human beings. At the beginning of the study, informed consent was obtained from parents or legal guardians authorizing confidential handling of radiographs and clinical data.

Results
Table 2 shows the results of the cephalometric variables for each of the malocclusion groups as classified by the experts. A one-way analysis of variance (ANOVA) found differences in the cephalometric variables among the 3 malocclusion groups (p<0.001). A post-hoc Tukey test then found statistically significant differences for all variables (ANB, Wits, APDI and AF-BF) in all groups (p<0.001). After the ROC curves were plotted, cut points were defined, which allowed the sensitivity and specificity of each of the four variables to be calculated. Between class I/II, ANB showed the largest area under the curve (AUC) (0.876), and the cut point (greatest sensitivity and specificity) was at 5.75°; between class I/III, the Wits appraisal presented the greatest AUC (0.874), with a cut point of -3.25 mm (Table 3).

Discussion
Receiver operating characteristic (ROC) curves are a widely accepted method for the evaluation and comparison of the performance of a test [18][19][20][21]. ROC curves have been used extensively in medicine; in dentistry, they are beginning to be used in radiology in areas such as early caries detection, maxillary canine impaction and cephalometric diagnosis of skeletal patterns [22,23][29][30][31].
When there is no established gold standard, the presence or absence of a disease (malocclusion in this case) can be determined using the consensus of a panel of experts [20]. This study used such a consensus as the gold standard, similar to a previously described protocol [23]. In this study, patients were aged 6-12 years, younger than the populations of other studies [22,23]. It is important to know which cephalometric variable is most accurate for diagnosing malocclusions at early ages.
Finding a difference between the three groups in cephalometric variables (Table 2) indicates that they were properly classified by the gold standard, and clinical inspection of occlusion and frontal and lateral photographs of the children are a valid diagnostic tool to detect skeletal sagittal alterations. These findings correspond with previous reports [22,23].
A previous study evaluated the sensitivity and specificity of ANB, Wits and APDI for intermaxillary sagittal skeletal classification and found that APDI had the best diagnostic performance for the class II and class III patterns. Its authors classified malocclusions according to molar displacement in cast models (gold standard) [22]. Later, other authors evaluated several cephalometric measures, including those used in the previous study, and concluded that the ANB angle had the best performance on ROC curves and that the Wits and APDI analyses showed some flaws [23]. They used as the gold standard a classification made by an expert panel, as does the present study.
These cited articles did not include the AF-BF variable. This measure was developed as an alternative for evaluating the anteroposterior maxillomandibular relationship; it counteracts the problems likely to arise from the inclination of the skull base and the occlusal plane and from vertical displacement of points A and B. It is an absolute measure of the sagittal relationship of the jaws with respect to the Frankfort plane, which is considered to be anatomically stable [6]. Since there is no information about its diagnostic accuracy, it was important to include AF-BF in the present study.
In the interpretation of ROC curves, the test with the greatest AUC is considered to have the best diagnostic performance [18][19][20][21]. In our study, the greatest AUC for discriminating between class I/II was that of the ANB angle, while the Wits appraisal performed better for discriminating between class I/III. These findings agree in part with a previous study, which reported that Wits had the greatest AUC for class II and class III and that the ANB angle, although it performed well, was less effective in classifying class II and III [23]. Another study found that APDI showed a higher level of accuracy for the diagnosis of class II and III [22]. This differs from our findings, in which APDI performed well, but was not the best, for either type of malocclusion.
We also found different cut points. In our study, the ANB cut point for the diagnosis of class II was greater (5.75°) than previously reported (3.6°), suggesting a more posterior mandibular position in younger children [23]. It has been reported that the mandible undergoes a pubertal growth spurt that can modify the facial and sagittal skeletal relationship [11,16,32,33].
In addition, the cut point of the Wits appraisal for class III diagnosis was -3.25 mm, less negative than that found previously (-4 mm) [23]. The other variables also presented cut points different from those previously reported, with values closer to class I. These findings suggest that class III diagnosis in young children should be thorough and comprehensive, since skeletal alterations can be subtle. The Chi-square test of homogeneity showed no statistically significant differences in diagnostic ability among the analyzed variables, similar to the findings of another study [22].
The DOR can be used to compare different diagnostic tests for the same diagnosis [27,28].
The greatest DOR for class II was that of ANB, and for class III that of Wits. Although all tests correctly classified malocclusions, these two were notable for their high diagnostic performance.
Our findings suggest that even though the ANB angle and the Wits appraisal had the best diagnostic performance for class II and III, respectively, their use should not be recommended over the other tests, since there was no statistically significant difference among the four variables.
The clinical relevance of our findings lies in the simplification of diagnosis and clinical decisions and in the identification of subtle alterations. Sample size was a limitation of the study. We suggest that future studies use a larger sample and evaluate ANB, Wits, APDI and AF-BF in patients with concurrent vertical alterations, comparing the results with our findings.

Conclusion
ANB angle presented the best diagnostic performance to classify class II skeletal relationship, and Wits appraisal for class III in patients aged 6-12 years. Cut points in children were different from those of adults. ANB, Wits, APDI and AF-BF correctly classified patients in the sagittal plane.