Objective treatment outcome assessment of a completely customized lingual appliance: A retrospective study

Objective > To assess the outcome quality of subjects treated with a completely customized lingual appliance (CCLA) in a postgraduate university program, using the ABO Objective Grading System (OGS), by testing the null-hypothesis of a significant proportion of post-treatment cases exceeding an adjusted 'exam failure' threshold value of OGS = 24. Materials and Methods > This retrospective single-arm study included 66 consecutively debonded CCLA cases (m/f 19/47; mean age: 25.1 ± 9 years) treated at Hannover Medical School (MHH, Hannover, Germany). The discrepancy index (DI) was assessed on initial plaster casts. The OGS of the cast-radiograph evaluation was scored for both set-up and post-treatment casts, including the seven components of alignment/rotation, marginal ridges, buccolingual inclination, overjet, occlusal contacts, occlusal relationships and interproximal contacts, to parameterize differences between those. Results > DI score distribution (≥ 20, < 20) was 25 (37.9%)/41 (62.1%) subjects. Mean initial DI was 17.3 ± 8.5. Mean set-up OGS was 10.4 ± 4.4 (min-max: 3–21), mean final OGS was 17.7 ± 5.9 (min-max: 7–33), and the difference of 7.3 (post-treatment − set-up) was statistically significant (P < 0.0001; 95% CI [5.8, 8.7]). The null-hypothesis was rejected: A statistically significant proportion of the final casts (n = 58; 87.8%) scored below OGS = 24 by exact binomial test (P < 0.0001; 95% CI [77.5%, 94.6%]). The rate of a final OGS score < 24 was not significantly different (P = 0.98) between both DI (≥ 20, < 20) groups. Conclusions > The outcome quality of the CCLA treatment in this postgraduate university setting was high and therefore sufficient for a vast majority of treated cases to pass the ABO-OGS clinical examination.


Introduction
Beyond the apparent advantages of lingual orthodontic treatment over labial fixed appliances in terms of aesthetics and reduction of white-spot lesion formation [1], a recent systematic review has reported an adequate effectiveness of lingual orthodontic treatment in achieving individual, pre-set treatment objectives, along with a pronounced reduction of the well-known side-effect of post-orthodontic decalcifications [2]. A majority of lingual appliances used today are completely customized lingual appliances (CCLA) [2]. The CCLA fabrication process is based on an individual target occlusion set-up model made by specialized dental technicians. Occlusal treatment objectives are mostly oriented towards the ideal occlusion concept [3], but may deviate from the six-keys concept in order to meet individual case requirements desired by the orthodontist or patient. The target set-up both marks the treatment objective and forms the basis for the fabrication and adjustment of the CCLA brackets and archwires. Previous studies have compared the accuracy of CCLA outcomes with the respective individual target set-up, and reported a high accuracy in achieving those pre-set treatment objectives [4,5], such as an average in/out deviation of less than 0.2 mm between set-up and final result [6]. The constantly growing demand for aesthetic orthodontic correction is also evident from the increasing popularity of aligner treatments, but it has simultaneously raised questions about the quality of the results achieved by these treatments. A study by Buschang et al. indicated that the treatment outcomes of half of the aligner patients did not seem to fully meet the standards set by the American Board of Orthodontics (ABO) objective grading system (OGS) [7]. Owing to the validity and reliability of its systematic approach, the ABO-OGS orthodontic cast and radiographic evaluation is widely regarded as a current gold standard for assessing the quality of orthodontic treatment outcomes [8][9][10][11][12][13][14]. Like Buschang et al. [7], Patterson et al. recently raised justified doubts about the ability of aligners to accomplish the treatment objectives set by the digital ClinCheck software in Angle-Class II cases, as they reported a surprising inability of aligners to achieve ABO-OGS standards in the presence of an Angle-Class II malocclusion [15]. To date, there is a lack of objective data on how well the clinical outcome quality of CCLA treatments meets the objective standard of the ABO-OGS. Mujagic et al. provided data on CCLA treatment outcomes in Angle-Class II subjects treated in combination with a Herbst appliance [16]. A recent retrospective study compared lingual with labial treatment outcomes; however, the investigated sample was flawed by not being based on consecutive patients, by a drop-out of more than 20% due to missing documents or defective records, and by the lack of a comparison with the set-up prediction [17].

Objective
The purpose of this study was to analyse differences between the CCLA treatment outcome and both the target occlusion predicted by the respective individual set-up and the standards set by the ABO-OGS. In addition, we tested the null-hypothesis of a significant proportion of post-treatment cases (H0: proportion = 50%) exceeding an adjusted threshold value of OGS = 24, i.e., that the true chance of achieving a final OGS score < 24 would be 50%.

Subjects
Orthopantomograms, initial and post-treatment plaster casts and set-up models of subjects treated with a CCLA by orthodontists at the Department of Orthodontics of the Hannover Medical School (MHH; Hannover, Germany) as part of the local Master of Science program in lingual orthodontics were consecutively included in this retrospective study according to the following inclusion and exclusion criteria. No restrictions were made in terms of the quality or severity of the initial malocclusion.
Inclusion criteria:
• initiation and completion of CCLA (WIN, DW Lingualsystems, Bad Essen, Germany) treatment within the lingual postgraduate education program of the Hannover Medical School, Department of Orthodontics (MHH; Hannover, Germany) from February 2012 until September 2018.
Exclusion criteria:
• cases with substantially compromised treatment plans (i.e., set-up models not requested to meet the six keys to ideal occlusion, that is, not set to perfect Angle-Class I canine and/or molar occlusion and/or without ideal overbite and overjet); drop-out = 3;
• treatment stopped ahead of schedule by patient or orthodontist; drop-out = 0.
No patient was excluded for any other reason, e.g., missing records, insufficient oral hygiene, or lack of compliance. No restrictions were made in terms of the extent or severity of the initial malocclusion, or of additional measures taken, such as orthognathic surgery or the use of a Herbst appliance in addition to CCLA treatment. Of a total of 69 consecutively debonded, potentially eligible patients (m/f 19/50), three female cases were excluded due to a substantially compromised treatment plan. No patient received orthognathic surgery; eight cases were treated with a Herbst appliance and two with a Forsus spring (Fatigue Resistant Device; 3M Unitek Corp, Monrovia, CA, USA) in addition to CCLA treatment. The final sample size was 66 (m/f 19/47; initial mean age 25.1 ± 9 years).

Methods
In order to systematically document the pre-treatment severity and complexity of the initial malocclusion, the discrepancy index (DI) was assessed for each patient, using a detailed worksheet provided by the ABO [18]. The studied items included alignment/rotation, marginal ridges, buccolingual inclination, overjet, occlusal contacts, occlusal relationship and interproximal contacts. Two subgroups were formed to separately analyse the impact of the treatment difficulty levels according to the ABO clinical examination guidelines: group DI < 20 (mild to moderate level), and group DI ≥ 20 (moderate to difficult level). The quality of the post-treatment occlusion was judged on the basis of the point deductions of an adjusted OGS from the cast-radiograph evaluation (CRE) section as specified by the ABO, and compared with that of the set-up models [8][9][10][11][12][13][14]. Since no panoramic radiographs exist for the set-up models, root angulation could not be evaluated, and only the OGS portion of the CRE was analysed. Therefore, the regular OGS passing score of 27 was reduced to a score of 24 in this study, taking into consideration an average potential score deduction of 3 points for root angulation in the CRE [18]. All measurements for scoring both the DI and the seven OGS components of alignment/rotation, marginal ridges, buccolingual inclination, overjet, occlusal contacts, occlusal relationships and interproximal contacts were performed by one trained examiner (FMQ), who is a diplomate of the ABO, using the ABO Measuring Gauge (resolution: 0.5 mm).
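The adjusted pass criterion described above can be sketched in a few lines of Python. This is only an illustration and not the authors' scoring procedure: the function names and the example deductions are hypothetical. Only the seven component names, the regular passing score of 27 and the assumed 3-point allowance for root angulation are taken from the text.

```python
# Seven OGS components scored on casts in this study. Root angulation is
# omitted because the set-up models have no radiographs.
OGS_COMPONENTS = (
    "alignment/rotation",
    "marginal ridges",
    "buccolingual inclination",
    "overjet",
    "occlusal contacts",
    "occlusal relationships",
    "interproximal contacts",
)

REGULAR_PASSING_SCORE = 27      # regular OGS passing score named in the text
ROOT_ANGULATION_ALLOWANCE = 3   # average deduction assumed for root angulation
ADJUSTED_THRESHOLD = REGULAR_PASSING_SCORE - ROOT_ANGULATION_ALLOWANCE  # 24


def total_ogs(deductions: dict) -> int:
    """Sum the point deductions over the seven cast-based components."""
    missing = set(OGS_COMPONENTS) - set(deductions)
    if missing:
        raise ValueError(f"missing components: {sorted(missing)}")
    return sum(deductions[name] for name in OGS_COMPONENTS)


def would_pass(total: int, threshold: int = ADJUSTED_THRESHOLD) -> bool:
    """A case counts as passing when its total stays below the threshold."""
    return total < threshold


# Hypothetical deductions for one case (illustrative values, not study data).
example_case = {
    "alignment/rotation": 4,
    "marginal ridges": 3,
    "buccolingual inclination": 4,
    "overjet": 2,
    "occlusal contacts": 2,
    "occlusal relationships": 2,
    "interproximal contacts": 0,
}

if __name__ == "__main__":
    score = total_ogs(example_case)
    print(score, would_pass(score))
```

Lower totals are better, since every point is a deduction; a total of exactly 24 is treated as a failure here, in line with the criterion "final OGS score < 24".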

Method error
To assess the accuracy and reproducibility of the DI and OGS scorings, the initial, set-up and post-treatment casts of ten arbitrarily selected subjects were re-assessed one month later by the same examiner. The intra-examiner reliability coefficients for the DI score, the set-up model OGS and the final model OGS were 0.998, 0.927, and 0.986, respectively, indicating a high level of reproducibility of the DI and OGS assessments.

Statistics
Measurement data derived from this single-arm study were analysed descriptively (frequencies and percentage values, mean ± standard deviation (SD), minimum and maximum values (min-max)) for age, duration of treatment, DI, and OGS. Primary endpoints were the OGS for the set-up and post-treatment models. They were derived as the sum of the seven components measured. An exact binomial test was used to test the null-hypothesis that the true chance of achieving a final OGS score < 24 would be 50% (H0: proportion = 0.5). The potential impact of different factors on the OGS difference (post-treatment − set-up) was assessed using a multifactorial ANOVA including DI score (DI ≥ 20, DI < 20) and gender (m, f) as fixed effects, and age at treatment start (in years), duration of treatment (in months) and OGS of the set-up as random effects. The correlation between two variables was assessed using the Pearson correlation coefficient. The impact of the DI score on the final OGS < 24 was additionally analysed by two subgroups (DI ≥ 20, DI < 20). The results were summarized in a frequency table and analysed using a Chi-square test. OGS score differences (set-up vs. post-treatment) were assessed in all subjects and subgroups by paired t-tests and reported using 95% CI and P-values. The significance level was set to α = 0.05. No alpha correction was performed. All statistical analyses were performed with SAS v9.4 (SAS Institute, Cary, NC, USA).
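For readers who wish to retrace the primary test, the following Python sketch runs an exact binomial test against H0: proportion = 0.5 and computes an exact (Clopper-Pearson) 95% confidence interval, using only the standard library. It is an illustration and not the SAS code used in the study: the function names are hypothetical, and the choice of the Clopper-Pearson interval is an assumption about what "exact" refers to. The input 58 of 66 is the number of final casts reported with an OGS < 24.

```python
from math import comb


def binom_pmf(k: int, n: int, p: float) -> float:
    """Probability of exactly k successes in n trials."""
    return comb(n, k) * p**k * (1 - p) ** (n - k)


def exact_binomial_p(k: int, n: int, p0: float = 0.5) -> float:
    """Two-sided exact p-value: total probability of all outcomes that are
    no more likely under H0 than the observed one."""
    pmfs = [binom_pmf(i, n, p0) for i in range(n + 1)]
    observed = pmfs[k]
    return sum(pm for pm in pmfs if pm <= observed * (1 + 1e-7))


def clopper_pearson(k: int, n: int, alpha: float = 0.05) -> tuple:
    """Exact confidence interval, found by bisection on the binomial tails."""

    def upper_tail(p: float) -> float:   # P(X >= k), increasing in p
        return sum(binom_pmf(i, n, p) for i in range(k, n + 1))

    def lower_tail(p: float) -> float:   # P(X <= k), decreasing in p
        return sum(binom_pmf(i, n, p) for i in range(0, k + 1))

    def solve(f, target: float, increasing: bool) -> float:
        lo, hi = 0.0, 1.0
        for _ in range(100):
            mid = (lo + hi) / 2
            if (f(mid) < target) == increasing:
                lo = mid
            else:
                hi = mid
        return (lo + hi) / 2

    lower = 0.0 if k == 0 else solve(upper_tail, alpha / 2, True)
    upper = 1.0 if k == n else solve(lower_tail, alpha / 2, False)
    return lower, upper


if __name__ == "__main__":
    p_value = exact_binomial_p(58, 66)      # 58 of 66 casts with OGS < 24
    ci_low, ci_high = clopper_pearson(58, 66)
    print(f"p = {p_value:.2e}, 95% CI = [{ci_low:.3f}, {ci_high:.3f}]")
```

With p0 = 0.5 the binomial distribution is symmetric, so this two-sided definition coincides with doubling the one-sided tail probability.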

Results
The characteristics of the subjects and the treatment are given in tables Ia and Ib, separately for the two malocclusion severity (DI) groups. Twenty-five subjects (37.9%) had a DI score ≥ 20, and 41 (62.1%) had a score < 20. Eight out of 66 patients (12.1%) had a final OGS score ≥ 24. Fifty-eight patients (87.8%) had a post-treatment OGS < 24, and this proportion was statistically significantly different from the null-hypothesis that the rate would be 50% (P < 0.0001, 95% CI [77.5%, 94.6%]). The rate of a post-treatment OGS < 24 was similar in both DI groups (88.0% vs. 87.8% for DI ≥ 20 vs. DI < 20). Accordingly, the Chi-square test did not detect a significant difference between the DI subgroups (P = 0.98; table II). The paired t-test revealed a mean OGS difference between set-up and post-treatment of 7.3 ± 5.9 (95% CI [5.8, 8.7]) that was statistically significant (P < 0.0001). Figures 1 and 2 and table III depict the proportions and contributions of the seven single OGS items.
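The subgroup comparison can be retraced with a short Python sketch of a Pearson Chi-square test on the 2 × 2 frequency table. The cell counts are derived from the figures reported in this paper (25 and 41 subjects per DI group, of whom 3 and 5 had a final OGS ≥ 24). The function name is hypothetical, and the use of the uncorrected Pearson statistic (no continuity correction) is an assumption about the analysis, not a statement of the SAS settings used.

```python
from math import erfc, sqrt


def chi_square_2x2(table):
    """Pearson Chi-square test (1 degree of freedom, no continuity correction)
    for a 2 x 2 table given as ((a, b), (c, d))."""
    (a, b), (c, d) = table
    n = a + b + c + d
    row_totals = (a + b, c + d)
    col_totals = (a + c, b + d)

    statistic = 0.0
    for i, row in enumerate(table):
        for j, observed in enumerate(row):
            expected = row_totals[i] * col_totals[j] / n
            statistic += (observed - expected) ** 2 / expected

    # Survival function of the Chi-square distribution with 1 df:
    # P(Z^2 > x) = erfc(sqrt(x / 2)) for a standard normal Z.
    p_value = erfc(sqrt(statistic / 2))
    return statistic, p_value


# Rows: DI >= 20, DI < 20. Columns: final OGS < 24, final OGS >= 24.
counts = ((22, 3), (36, 5))

if __name__ == "__main__":
    stat, p = chi_square_2x2(counts)
    print(f"chi2 = {stat:.4f}, p = {p:.2f}")
```

Because the pass rates of the two groups are almost identical (22/25 vs. 36/41), the observed counts lie very close to the expected counts and the test statistic is close to zero.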

In the multifactorial ANOVA, there was no statistically significant effect on the OGS difference (post-treatment − set-up) for most of the included variables, with the exception of treatment duration (in months) and set-up OGS. Longer treatment duration seemed to be associated with larger final OGS scores (table IV). Table Ib shows that mean treatment duration differed between the DI groups, owing to the complexity of the DI ≥ 20 cases. In addition, the set-up OGS was statistically significantly correlated with the final OGS as well as with the OGS difference (P < 0.002). When split into the different DI groups, the set-up OGS was only significantly correlated with the final OGS for cases with a DI < 20.
The impact of the DI on the OGS difference was explored by separate paired t-tests per DI subgroup. The results were comparable, with mean differences of 7.0 vs. 7.8 (DI < 20 vs. DI ≥ 20, table Ib), and the difference was statistically significant in both subgroups (P < 0.001).

Discussion
The ABO-OGS constitutes the current gold standard for objectively assessing clinical orthodontic treatment results, and previous studies have reported an adequate accuracy and reliability for both clinical purposes and university or ABO exams [8][9][10][11][12][13][14]. Accordingly, the repeated measurement analysis in our study yielded an acceptable intra-rater variability.

Initial DI
The average initial DI score in our sample was 17.26 ± 8.49, indicating a moderate to high treatment difficulty level according to the ABO clinical guidelines. Separate analysis of the sample based on the DI severity (≥ 20, < 20) yielded an ABO clinical examination passing rate of about 88% for both groups (table II). Also, the mean OGS score of each group, analysed separately, was below 24. Accordingly, the severity of the initial malocclusion did not seem to impair the clinical performance of the CCLA in terms of outcome quality. This result is in agreement with previous reports on how CCLA treatment outcomes compare with their respective set-up [4-6,19,20]. However, the initial DI had an impact on mean treatment duration, which was shorter in group DI < 20 (21.1 months) than in group DI ≥ 20 (30.7 months).

Null-hypothesis, OGS results
The set-up models that provided the basis for the CCLA treatments were adjusted as closely as possible to Andrews' six-keys to ideal occlusion concept [3]. Nevertheless, the worst set-up OGS score assessed in this study was 21, with a mean of 10.33 ± 4.38, which indicates that, particularly in adult treatment, the complete achievement of Andrews' six keys is not always a realistic treatment goal.
In our study, eight out of the 66 post-treatment models had an OGS score exceeding the threshold value of 24. This would translate into an ABO passing rate of 87.87% of the cases, with the passing score set to 24. Separate analysis of the subgroups (DI ≥ 20; DI < 20) showed that 3 of 25 (DI ≥ 20, 12.0%) and 5 of 41 (DI < 20, 12.2%) cases scored above 24 OGS points. Thus, the null-hypothesis of a significant proportion of post-treatment cases exceeding an adjusted threshold value of OGS = 24 was rejected (P < 0.0001). This indicates that significantly more than 50% of the patients reached an outcome that would suffice to pass the ABO exam.
We defined the relevant percentage to be 50% since we considered this proportion to be clinically relevant. The high-quality laboratory set-up procedure during CCLA fabrication, along with pre-bent precision archwires, is assumed to have contributed to this high proportion of potentially passed ABO exams. Also, the routine use of a transfer tray for precise indirect bracket bonding may be considered a relevant factor in terms of treatment quality [21]. Furthermore, particularly in a university program, the postgraduate clinicians are supervised by an expert in lingual treatment and motivated to follow the recommended treatment protocol, the latter of which is crucial for the quality of the lingual treatment outcome. Based on a comparison of OGS outcomes of various studies reported by Buschang et al., and to the best of our knowledge, the assessed OGS of 17.7 ± 5.9 for post-treatment casts in this study seems to be the lowest OGS score reported in the literature for a sample of consecutively treated cases in a university setting [6]. Despite a statistically significant overall OGS difference (set-up vs. post-treatment), the median OGS scores of both groups were distinctly below the ABO clinical passing score, with a final median OGS value of 17 (table I). Clinically, we therefore considered the set-up/post-treatment differences only marginally relevant. In detail, separate descriptive analysis of the seven individual components of the ABO-OGS showed that the major contributions to the OGS point deductions of the set-up models were nearly uniform, at approximately 20% each for alignment/rotation, buccolingual inclination and marginal ridges (figure 1). In comparison, alignment/rotation (25%) and buccolingual inclination (26%), followed by marginal ridges, contributed most of the point deductions of the post-treatment OGS, while interproximal contacts contributed the least (figure 1 and table II).
It is noteworthy that whilst marginal ridges had approximately the same impact on the OGS deductions of set-up and final casts, alignment/rotation and buccolingual inclination roughly doubled their impact, followed by overjet and occlusal relationships (table III). Interproximal contacts had a minor impact on absolute score deduction (figure 1).

OGS in other studies/appliances
Buschang et al. [7] evaluated 27 consecutive cases treated with aligner therapy utilizing the OGS. They did not report the DI for their sample; however, due to the nature of the aligner technique, it may be assumed that the treatment difficulty or initial malocclusion severity of those cases was minor to moderate. The median OGS point deductions for the 'ClinCheck' (equivalent to a digital set-up model) and post-treatment models were 14 (min-max 5-32) and 24 (min-max 12-43), respectively [7], compared to a median 9.  [24].
Apart from the presence of the individual target set-up [4,5] and the use of transfer trays for indirect CCLA bracket bonding, the higher potential ABO-OGS exam failure rates in these studies compared to our results may also be explained by the slot/wire interplay of fully customized vs. molded brackets, the latter of which have been reported to be oversized by up to 24% compared to the manufacturer's specification [25]. Previous analyses of superimpositions of treatment results produced by a CCLA with the respective set-up yielded a high predictability of CCLA results, such as deviations of 2° in rotations and below 0.2 mm in translations for incisors [6]. Likewise, the third-order control or torquing capacity of CCLAs with a full-size β-titanium archwire (0.018″ x 0.018″ and 0.018″ x 0.025″) has previously been measured to have a tolerance of only 0-2° [19].

Limitation
The design of this study was retrospective, and although it allows for a comparison with various similarly designed studies [7,9,22,23], there was no direct comparator within this study. Although our study sample was composed without excluding any patient due to missing records, insufficient oral hygiene, lack of compliance or the severity of the initial malocclusion, the retrospective single-arm design needs to be taken into account in terms of generalisability.

Conclusion
The quality of both the set-up and the post-treatment result achieved by the tested CCLA is high.About 88% of our sample of cases treated consecutively as part of a university program would have passed the ABO-OGS clinical examination, regardless of the initial severity of the malocclusion.
Disclosure of interest: the authors declare that they have no competing interest.