A Computer-aided Method for Improving the Reliability of Lenke Classification for Scoliosis

Classification of the spinal curve pattern is crucial for assessment and treatment of scoliosis. We developed a computer-aided system to improve the reliability of three components of the Lenke classification. The system semi-automatically measured the Cobb angles and identified the apical lumbar vertebra and its pedicles on digitized radiographs. The system then classified the curve type, lumbar modifier, and thoracic sagittal modifier of the Lenke classification based on the computerized measurements and identifications. The system was tested by five operators on 62 scoliotic cases. The kappa statistic was used to assess the reliability. With the aid of the computer, the average intra- and interobserver kappa values were improved to 0.89 and 0.81 for the curve type, to 0.83 and 0.81 for the lumbar modifier, and to 0.94 and 0.92 for the sagittal modifier of the Lenke classification, respectively, relative to the classification by two of the operators without the aid of the computer. Results indicate that the computerized system can improve reliability for all three components of the Lenke classification.


INTRODUCTION
Scoliosis is a complex three-dimensional (3D) deformity of the spine. It is defined as a lateral curvature of the spine greater than 10° accompanied by vertebral rotation [1]. Classification of the spinal curve pattern plays an important role during preoperative surgical planning when selecting fusion levels. Currently, the Lenke [2] and King [3] classifications are widely used for that purpose. The Lenke classification, which presents a more global scheme than the King classification, includes three components: curve type, lumbar modifier, and thoracic sagittal modifier. The Lenke method can be described in a chart, as shown in Figure 1, courtesy of Lenke et al. [2]. The Lenke classification relies on subjective identification and measurement of the radiographic features, including measurement of the Cobb angle in different spinal regions and identification of the apical lumbar vertebra (ALV). It also requires interpretation and memory of the classification criteria. Although Ogon et al. [4] claimed that the Lenke classification had overall better reliability than the King classification, Richards et al. [5] reported only fair reliability of the Lenke system.
Recently, some computer-aided methods were developed to measure scoliosis curves and to classify spinal deformity on radiographs. Phan et al. [6] presented a review of the reported algorithms during 2000-2009 aiming to assist the evaluation and treatment of adolescent idiopathic scoliosis. For the King classification, Stokes et al. [7,8,9] developed a computer program to calculate Cobb angles based on analysis of coordinates of vertebral landmarks on radiographs and then to identify the King types using a rule-based algorithm. Their approach required manual identification of numerous landmarks (70 per radiograph). The inherent operator-dependency of landmark identification might result in subsequent measurement variability.
Phan et al. [10] and Mezghani et al. [11] developed an assessment system for idiopathic scoliosis using the 3D features generated from the self-organizing maps to classify scoliotic cases from a large database. Poncet et al. [12] proposed a classification system that defined three patterns of scoliotic curves based on the spinal geometric torsion. These methods were different from the traditional way of measuring scoliotic curves (Cobb or Ferguson angle).
Phan et al. [13] and Mezghani et al. [14] translated the Lenke chart into a decision-rule tree for the curve type classification. They reported 93% accuracy of the curve type classification by the users using a rule-based diagram, and 99% accuracy by the computer classifier. In their studies, the radiographic measurements from each case were presented on a spreadsheet instead of radiographs to avoid variability from these measurements. However, there is variability in these measurements, including that in the Cobb measurement, which is the most important measurement in the Lenke classification. The Cobb angle can be manually measured as the angle between the lines drawn along the upper endplate of the superior end-vertebra and the lower endplate of the inferior end-vertebra. Studies of intra- and interobserver variability of manual Cobb measurements have revealed a typical error of ±5°, which introduces variability in the classification. In our previous study [15], a computer-aided approach was proposed to reduce the variability in the Lenke-type classification by a computerized Cobb measurement. However, our previously published method only reduced the variability in the curve type classification, without considering the variability in identifying modifiers of the Lenke system. Duong et al. [16] developed a computer-aided method to improve reliability in determining lumbar modifiers using 3D spine models. Duong et al. [17] also used a wavelet transform of the vertebrae centroids and a fuzzy clustering algorithm to group 3D spine shapes. Lin [18] implemented an artificial neural network to automatically identify the King types based on features extracted from a simplified 3D spine model by a total curvature analysis. Lin et al. [19] presented a preliminary study for a computer-aided Lenke classification based on 3D spine models, but did not report its reliability. Using 3D spine models, Sangole et al. [20] and Stokes et al. [21] each performed clustering analysis and extracted clusters. These methods illustrated that 3D parameters are important. However, these methods did not integrate accepted clinical parameters and were not considered to be very intuitive by physicians. Therefore, their clinical use is limited [6].

Figure 1. Chart description of the Lenke classification criteria [2].
Journal of Healthcare Engineering · Vol. 6 · No. 2 · 2015
In our previous study [22], a computerized approach that automatically measured vertebral inclination was developed to improve the reliability of the Cobb measurement on digitized radiographs. In the present study, we proposed a computer-aided Lenke classification approach using the computerized Cobb measurement to reduce the variability for all three components of the Lenke system.

MATERIALS AND METHODS
Posteroanterior (PA) and lateral (LAT) radiographs were obtained from 62 patients (51 girls, 11 boys; 12.5 ± 3.3 years of age) with idiopathic scoliosis who met the following inclusion criteria: (1) diagnosis of idiopathic scoliosis; (2) ages 9-18 years; (3) no prior spine surgery; (4) Cobb angle <90°; and (5) visibility of the pelvis and T1-L5 vertebral levels on radiographs. Patients who had other musculoskeletal or neurological disorders were excluded. Informed consent was obtained from all patients/parents. The institutional review board of Yunnan University approved the study. The main curves of patients included in this study had a mean Cobb angle of 41° ± 17°, as measured by an orthopedic surgeon.
In our previous study [22], an approach based on the fuzzy Hough transform (FHT) was developed for Cobb measurements of a spinal curve. We adopted this approach in the present study to identify the inclination of each vertebral endplate for the automatic Lenke classification.

Measurement of the Curve Angle
In the Lenke system, there are four curve pattern locations along the spinal column: proximal thoracic (PT), main thoracic (MT), thoracolumbar (TL), and lumbar (L), as shown in Figure 1. The Cobb angles were measured in the coronal and sagittal planes for the following curves: (1) the PT, MT, and TL/L curves on the standing coronal radiograph; (2) the proximal thoracic (PTB), main thoracic (MTB), and thoracolumbar/lumbar (TLB) curves on the side-bending coronal radiograph; (3) the proximal thoracic (PTS), main thoracic (MTS), and thoracolumbar/lumbar (TLS) kyphosis curves on the sagittal radiograph.
To measure these curve angles, the operator should assign names to the highest and lowest thoracic vertebrae that appeared on both PA and LAT radiographs. Although in all cases used in this study, all thoracic vertebrae were visible on both PA and LAT radiographs, the user-assigned name of the highest thoracic vertebra would be used in the condition that some higher thoracic vertebrae were invisible on radiographs. The number of vertebrae in a region may vary from case to case, although overall the number is the same, i.e., 12 thoracic vertebrae and 5 lumbar vertebrae. Therefore, the user-assigned name of the lowest thoracic vertebra was used to differentiate between thoracic and lumbar vertebrae in the subsequent algorithms.
All digitized radiographs were normalized to a standard height of 1000 pixels. From the highest thoracic vertebra, the operator successively selected vertebrae by clicking the mouse on each vertebra on the PA and LAT radiographs, respectively. Once the operator clicked on a vertebra, an initial rectangle of 100 × 80 pixels was created and displayed. The operator could adjust the rectangle to fit the vertebra by using features such as magnification or minification, clockwise or counterclockwise rotation, and up, down, left, or right movements. As an example, Figure 2 shows the rectangles assigned on the radiographs. Each user-assigned rectangle defined a region of interest (ROI) for a vertebra. A vertebral ROI is shown in Figure 3. After rectangle selection, the Canny edge detector [23] was employed to obtain the required edge image for the Hough transform in each ROI. To delete noise and artifacts, an inner rectangle and an outer rectangle according to the user-assigned rectangle were automatically defined in the algorithm. The distance between the inner rectangle and the user-assigned rectangle was one-sixth of the height of the user-assigned rectangle, and the distance between the outer rectangle and the user-assigned rectangle was one-eighth of the height of the user-assigned rectangle. The ROI was the region that just covered the outer rectangle. The noise and artifacts were then automatically deleted by excluding the edges inside the inner rectangle and outside the outer rectangle in the edge image of the ROI. For example, Figure 3 shows an ROI with the inner and outer rectangles (Figure 3A), the edge image of the ROI (Figure 3B), and the edge image with noise deleted (Figure 3C). These edge images and the inner and outer rectangles were obtained in a background process and were not displayed to users.
After preprocessing, the FHT was applied to the edge image of each ROI. Although an ROI might contain edges from neighboring vertebrae, especially for the cases with large kyphosis or lordosis in PA view or large scoliosis in LAT view, the FHT with vertebral shape constraints could correctly detect the inclination of the analyzed vertebra. It could be observed in the normalized PA and LAT radiographs that the vertebral shape satisfied the following geometric relations: (1) The average angle of the two endplates was tilted <45°, and the average angle of the vertical edges was 45°-90°. (2) The distance between the two endplates of a vertebra was in the range of 30-60 pixels, and the distance between the two vertical edges was 40-80 pixels. (3) The angle difference between the two endplates or the two vertical edges of a vertebra was <10°. (4) The endplates and vertical edges were close to perpendicular to each other. This prior knowledge of the vertebral shape was used to select the candidate peaks in the Hough space: two pairs of peaks corresponding to the lines (i.e., a pair of endplates and a pair of vertical edges) that best fit a vertebra were detected. Figure 3D shows the detected lines for the edge image shown in Figure 3C. See Zhang et al. [22] for more details of this algorithm. Each endplate inclination was recorded automatically. The rectangle that consisted of the four fitted lines was also recorded for each lumbar vertebra, i.e., the lumbar vertebral rectangle. The position of a lumbar vertebra was defined as the center of the lumbar rectangle. Because the algorithm calculating the Cobb angle could only measure one curve at a time according to the endplate inclination, the operator had to specify the curve to be measured by inputting the names of the highest and lowest vertebrae of the curve. The computer automatically calculated the Cobb angle of the curve from the inclinations of the superior/inferior endplates of the vertebrae measured on the curve. The Cobb angles of the assigned curves were displayed, as in the example shown in Figure 2. After the Cobb measurement, the major curve with the maximum Cobb angle in the coronal radiograph was automatically identified by the system.

Figure 2. Interface of the computer-aided Lenke classification system.
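The last step above, going from recorded endplate inclinations to a Cobb angle and a major curve, can be sketched as follows. This is a minimal illustration, not the authors' implementation: it assumes the Cobb angle is the angle between the superior endplate of the operator-named highest vertebra and the inferior endplate of the operator-named lowest vertebra (the manual definition quoted in the Introduction), and the names `cobb_angle`, `major_curve`, and the dictionary layout are hypothetical.

```python
def cobb_angle(endplates, highest, lowest):
    """Cobb angle (degrees) of one curve.

    endplates: dict mapping a vertebra name (e.g. "T5") to a tuple
               (superior_inclination, inferior_inclination) in signed
               degrees relative to the horizontal. Hypothetical layout.
    highest, lowest: names of the end vertebrae entered by the operator.
    """
    superior_of_highest = endplates[highest][0]
    inferior_of_lowest = endplates[lowest][1]
    # Assumption: the angle between the two endplate lines equals the
    # absolute difference of their signed inclinations.
    return abs(superior_of_highest - inferior_of_lowest)


def major_curve(coronal_cobb_angles):
    """Return the name of the coronal curve with the largest Cobb angle."""
    return max(coronal_cobb_angles, key=coronal_cobb_angles.get)


# Example with made-up inclinations for a main thoracic curve T5-T11.
endplates = {"T5": (12.0, 10.0), "T11": (-18.0, -20.0)}
mt = cobb_angle(endplates, "T5", "T11")
print(mt)
print(major_curve({"PT": 18.0, "MT": mt, "TL/L": 30.0}))
```

Keeping the per-vertebra inclinations in one table means that several curves (PT, MT, TL/L) can be measured from the same set of detected endplates by changing only the two end-vertebra names.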

Classification of Curve Types
The Lenke system defines six curve types based on the analysis of structural characteristics of curves, as shown in Figure 1. The structural criteria are defined as the PTB Cobb ≥25° or the PTS Cobb ≥20° for the PT curve; the MTB Cobb ≥25° or the MTS Cobb ≥20° for the MT curve; and the TLB Cobb ≥25° or the TLS Cobb ≥20° for the TL curve. In this study, we implemented the curve type classification automatically using the computerized Cobb measurement and applying a rule-based algorithm based on the following logic, similar to that proposed by Phan et al. [13] and Mezghani et al. [14]: If there was a major curve in the MT region, the scoliosis was type 1 if both PT and TL/L curves were nonstructural, type 2 if the TL/L curve was nonstructural and the PT curve was structural, and type 3 if the TL/L curve was structural and the PT curve was nonstructural. If the major curve was in the TL/L region, the scoliosis was type 5 if the MT curve was nonstructural and type 6 if the MT curve was structural and the PT curve was nonstructural. If the PT, MT, and TL/L curves were structural, the scoliosis was classified as type 4. The classification rule is illustrated in Figure 4.
In this system, after the user assigned the curves to be measured and set the rectangles for vertebrae on radiographs, the curve type could be classified automatically, which was different from the system developed by Phan et al. [13] or Mezghani et al. [14] that required input of the measured Cobb angles first.
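The rule set described in this section can be written as a short decision function. The sketch below is an illustration under stated assumptions rather than the system's actual code: the function names and argument layout are hypothetical, the major curve is treated as structural by definition, and a major curve outside the MT and TL/L regions (a case the text does not describe) returns `None`.

```python
def is_structural(bending_cobb, sagittal_cobb):
    """Structural criterion from the text: side-bending Cobb >= 25 degrees
    or sagittal (kyphosis) Cobb >= 20 degrees."""
    return bending_cobb >= 25.0 or sagittal_cobb >= 20.0


def lenke_curve_type(major, pt_structural, mt_structural, tl_structural):
    """Return the Lenke curve type (1-6), or None if the rules do not apply.

    major: region of the major coronal curve, "MT" or "TL/L".
    *_structural: booleans for the PT, MT, and TL/L curves.
    """
    if major == "MT":
        # The major MT curve itself is assumed structural.
        if pt_structural and tl_structural:
            return 4
        if pt_structural:
            return 2
        if tl_structural:
            return 3
        return 1
    if major == "TL/L":
        if not mt_structural:
            return 5
        if pt_structural:
            return 4  # PT, MT, and TL/L all structural
        return 6
    return None  # assumption: case not covered by the described rules


# Example: major MT curve, PT bends out to 15 degrees with 10 degrees
# of kyphosis (nonstructural), TL/L bends out only to 28 degrees (structural).
pt = is_structural(15.0, 10.0)
tl = is_structural(28.0, 5.0)
print(lenke_curve_type("MT", pt, True, tl))
```

Separating the structural test from the type decision mirrors the text: the thresholds act on measured angles, while the type rules act only on the resulting structural/nonstructural flags.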

Identification of Lumbar Modifiers
To identify the lumbar modifier type, the central sacral vertical line (CSVL) and the ALV were identified. The operator assigned two symmetrical landmarks on the sacrum of the coronal radiograph by computer mouse clicks. According to the Lenke method [2], the lateral edge of the radiograph was used to define the vertical direction. The CSVL was determined as the vertical line passing through the midpoint of the two sacral landmarks. The ALV was identified based on the detected endplate inclinations. The candidate ALV region was first located at an area that included two vertebrae above and two vertebrae below a disc separating two vertebrae tilting in opposite directions to the horizontal. The lumbar vertebra in the region that had the greatest horizontal distance from the CSVL was then identified as the ALV. Once the ALV was identified, the system displayed the magnified coronal ROI of the ALV, as shown in Figure 2. The operator set two rectangles to cover the two pedicles, respectively, by mouse clicks, as shown in Figure 2. The algorithm, which was based on the snake model under an elliptical shape constraint, automatically detected each ALV pedicle and its center; the initial snake curve was obtained by fitting an ellipse into the rectangle with its long and short axes equal to the height and width of the rectangle, respectively (Figure 2). See Zhang et al. [22] for a more detailed description of this technique.
The system automatically analyzed the relation between the CSVL and the ALV. If the CSVL did not touch the ALV rectangle, the lumbar modifier was identified as type C. Otherwise, the horizontal interpedicular distance and the distance between the CSVL and each pedicle were automatically measured on the coronal radiograph. We then calculated the absolute difference between the distances of the two pedicles to the CSVL. If the horizontal interpedicular distance was greater than the absolute difference, the lumbar modifier was identified as type A. Otherwise, it was considered to be type B, as shown in Figure 5.

Figure 4. Flowchart of the classification rule adapted from Figure 3 in Phan et al. [13] or Mezghani et al. [14].
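The lumbar modifier rule can be illustrated with a few lines of geometry on horizontal pixel coordinates. This is a sketch under assumptions, not the published code: the CSVL is taken as exactly vertical (so only x coordinates matter), the ALV rectangle is reduced to its horizontal extent, and all function and parameter names are hypothetical.

```python
def csvl_x(sacral_left_x, sacral_right_x):
    """x coordinate of the CSVL: midpoint of the two sacral landmarks."""
    return (sacral_left_x + sacral_right_x) / 2.0


def lumbar_modifier(csvl, alv_min_x, alv_max_x, pedicle_left_x, pedicle_right_x):
    """Return "A", "B", or "C" following the rule described in the text."""
    # Type C: the CSVL does not touch the ALV rectangle.
    if csvl < alv_min_x or csvl > alv_max_x:
        return "C"
    interpedicular = abs(pedicle_right_x - pedicle_left_x)
    difference = abs(abs(pedicle_left_x - csvl) - abs(pedicle_right_x - csvl))
    # When the CSVL lies between the pedicle centers, the difference of the
    # two distances is smaller than the interpedicular distance (type A);
    # when it lies outside them, the two quantities are equal (type B).
    return "A" if interpedicular > difference else "B"


# Example with made-up pixel coordinates.
line = csvl_x(90.0, 110.0)  # CSVL at x = 100
print(lumbar_modifier(line, 80.0, 130.0, 90.0, 120.0))
print(lumbar_modifier(85.0, 80.0, 130.0, 90.0, 120.0))
print(lumbar_modifier(70.0, 80.0, 130.0, 90.0, 120.0))
```

Because type B corresponds to an exact equality of two measured distances, a practical implementation would likely compare with a small tolerance; the strict comparison here simply follows the rule as worded.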

Identification of Thoracic Sagittal Modifiers
According to the Lenke classification criteria (Figure 1), the sagittal modifier classification is based on the Cobb angle measured from T5 to T12 on the lateral radiograph. The system calculated the Cobb angle using the recorded values of the endplate inclinations measured from the lateral radiograph. It then automatically identified the sagittal modifier as hypokyphosis if the angle was <10°, normal kyphosis if the angle was 10°−40°, or hyperkyphosis if the angle was >40°.
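The thresholds above map directly onto a three-way branch. A minimal sketch, with a hypothetical function name and the category labels taken from the text:

```python
def sagittal_modifier(t5_t12_cobb):
    """Classify the thoracic sagittal modifier from the T5-T12 Cobb angle
    (degrees) measured on the lateral radiograph."""
    if t5_t12_cobb < 10.0:
        return "hypokyphosis"
    if t5_t12_cobb <= 40.0:
        return "normal kyphosis"
    return "hyperkyphosis"


for angle in (8.0, 10.0, 40.0, 41.0):
    print(angle, sagittal_modifier(angle))
```

Both boundary values (10° and 40°) fall in the normal range, which matches the stated intervals of <10°, 10°-40°, and >40°.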

Evaluation
Five operators participated in the experiments including a pediatric orthopedic surgeon with 10 years of experience in a scoliosis clinic, an orthopedic resident, a musculoskeletal radiologist, a medical student without experience in orthopedic radiology, and the software developer, who had no clinical experience. Each operator performed the following tasks three times each over a 3-week period: input names of the highest and lowest thoracic vertebrae, assigned the curves to be analyzed by respectively inputting names of the highest and lowest vertebrae of the curves, set the rectangles for vertebrae, identified two symmetrical landmarks on the sacrum, and set the rectangles for pedicles of the ALV on the coronal radiograph. Under the user interactions, the computer automatically implemented the functions of detecting the vertebral endplates, calculating the endplate inclination, calculating the Cobb angle for each curve, identifying the major curve, classifying the curve type, identifying the ALV, detecting the ALV pedicle and its center, drawing the CSVL, analyzing the relation between the CSVL and ALV, and identifying the lumbar modifier and thoracic sagittal modifier. Without using the computer, the surgeon and resident each also classified spinal curve patterns three times using only the chart description in the traditional Lenke method based on the traditional Cobb measurement and ALV identification. The kappa statistic (κ) in SPSS software (SPSS Inc, Chicago, IL, USA) was used to assess the variability in the Lenke classification with and without aid from the computer. Under each of the two conditions, the κ values were calculated for paired sets of classifications by each operator (intraobserver repeatability) or between operators (interobserver reliability), using all combinations of paired observations. 
The resulting values were averaged over combinations of pairs (intra- or interobserver) to provide an overall measure of variability. The Landis and Koch criteria [24] for κ values were adopted: 0−0.20 indicated slight agreement, 0.21−0.40 fair agreement, 0.41−0.60 moderate agreement, 0.61−0.80 substantial agreement, and 0.81−1.00 almost perfect agreement. For cases that were consistently classified by the surgeon three times without the computer's aid, the surgeon's first trials with and without the computer's aid were compared to evaluate the accuracy of the proposed system.
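The evaluation procedure (κ for each pair of classification sets, averaged over all pairs, then interpreted with the Landis and Koch bands) can be sketched as below. The study used SPSS; this stand-alone version assumes the unweighted Cohen's κ for two raters and uses hypothetical function names, so it illustrates the calculation rather than reproducing the reported values.

```python
from collections import Counter
from itertools import combinations


def cohen_kappa(ratings_a, ratings_b):
    """Unweighted Cohen's kappa for two equally long lists of labels."""
    if len(ratings_a) != len(ratings_b) or not ratings_a:
        raise ValueError("rating lists must be non-empty and of equal length")
    n = len(ratings_a)
    observed = sum(a == b for a, b in zip(ratings_a, ratings_b)) / n
    freq_a, freq_b = Counter(ratings_a), Counter(ratings_b)
    # Chance agreement from the marginal label frequencies of each rater.
    expected = sum(freq_a[c] * freq_b[c] for c in freq_a) / (n * n)
    if expected == 1.0:
        return 1.0  # both raters used one identical label throughout
    return (observed - expected) / (1.0 - expected)


def mean_pairwise_kappa(rating_sets):
    """Average kappa over all pairs of rating sets (trials or operators)."""
    pairs = list(combinations(rating_sets, 2))
    return sum(cohen_kappa(a, b) for a, b in pairs) / len(pairs)


def landis_koch(kappa):
    """Agreement band for a kappa value, using the cut-offs quoted above."""
    if kappa < 0.0:
        return "poor"  # band for negative values, not listed in the text
    for upper, label in ((0.20, "slight"), (0.40, "fair"),
                         (0.60, "moderate"), (0.80, "substantial")):
        if kappa <= upper:
            return label
    return "almost perfect"


# Example: three trials of curve-type labels for four made-up cases.
trial_1 = [1, 1, 2, 2]
trial_2 = [1, 1, 2, 2]
trial_3 = [1, 1, 2, 1]
k = mean_pairwise_kappa([trial_1, trial_2, trial_3])
print(round(k, 2), landis_koch(k))
```

For intraobserver repeatability the rating sets would be one operator's repeated trials; for interobserver reliability they would be the classifications of different operators.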

RESULTS
As shown by the results of intraobserver repeatability in Table 1, without computer aid, the κ values of the two operators were in the substantial agreement range (≤ 0.80) for the curve type and lumbar modifier classification, and were in the almost perfect agreement range for the sagittal modifier classification (> 0.80). With computer aid, the κ values of the five operators were in the range of 0.82−0.94 (average 0.89), 0.81−0.86 (average 0.83), and 0.92−0.96 (average 0.94) for the curve type, lumbar modifier, and sagittal modifier, respectively, all of which were in the almost perfect agreement range (> 0.80). For the curve type, lumbar modifier, and sagittal modifier, the average κ values were improved from 0.77 to 0.89, from 0.79 to 0.83, and from 0.89 to 0.94, respectively; the average values of consistency were improved from 82% to 92%, from 89% to 94%, and from 97% to 98%, respectively. Use of the computerized method resulted in improvement in the κ value and consistency for both the surgeon and the resident, especially for the resident in the curve type classification.
As shown by the results of interobserver reliability in Table 2, by using the proposed method, the average interobserver κ values and consistency were improved in all three components of the Lenke classification, especially in the curve type classification. For the curve type, lumbar modifier, and sagittal modifier, the average κ values were improved from 0.65 to 0.81, from 0.75 to 0.81, and from 0.83 to 0.92, respectively; the average values of consistency were improved from 74% to 85%, from 86% to 91%, and from 93% to 97%, respectively. The overall interobserver κ values increased over the three series of [...] (3) inconsistent classification of the sagittal modifier in 7 patients. Without the computer's aid, the surgeon consistently classified 49 cases in three trials. For the 49 cases, the surgeon's first trials with and without the computer aid were compared. Results showed that under the two conditions, the surgeon's classifications were consistent in 48 of the 49 cases (98%). The one inconsistent case was due to inconsistent classification of the TL/L curve as structural (type 1 or 3).
Processing time for each case was also measured. The average processing time was 10 minutes, including about 3 minutes for the user interaction. Processing time should decrease as the operator becomes more familiar with the software.

DISCUSSION
Spinal curve pattern classification that depends on radiographic measures is crucial for surgical planning. Reliability of the spinal deformity classification has been an important topic in the orthopedic community. Some studies have reported only poor-to-fair reliability of the Lenke classification [4,5]. Ogon et al. [4] reported mean κ values of 0.73 and 0.62 for the intraobserver and interobserver reliability, respectively. Richards et al. [5] reported mean κ values of 0.60 and 0.50 for the intraobserver and interobserver reliability, respectively. The variability is mainly due to human technical and judgment errors. This article proposes a computer-aided method to reduce variability in the Lenke classification, which computerizes the radiographic measurements and systematically automates the Lenke classification.
The experimental results of this study indicated that the tasks of measuring curves and identifying the ALV, which subsequently enable classification of the curve type and modifiers, were completed more reliably with the aid of a computerized system. First, the reliability of curve measurement was improved by the computer-aided Cobb measurement method, whose accuracy and reliability had been demonstrated in our previous study [22]. Second, identification of the ALV and its pedicles and the relation between the CSVL and the ALV were more reliable with the computerized system. Third, judgment errors were reduced when the computerized classification algorithm was followed. The multiple parameters considered in the Lenke classification (i.e., angles of the PT, MT, and TL/L curves on standing and side-bending coronal radiographs and sagittal radiographs and the CSVL-ALV relation) are determined using only chart descriptions in the traditional Lenke method, which is confusing and subjective. A computerized system that is more objective and is immune to this confusion improved classification reliability. The Lenke classification is used to facilitate the objective assessment of scoliosis, i.e., to make it possible to "speak the same language" in the assessment. There is no gold standard for the classified results before surgery. Therefore, the accuracy of the system was evaluated by comparing the surgeon's classifications with and without the computer aid, using the cases that were consistently classified by the surgeon three times without the computer aid as the gold standard. The comparison result demonstrated the accuracy of the proposed method (98% accuracy).
The computer classifier developed by Mezghani et al. [14] achieved 99.66% accuracy for the curve type classification tested on a database of 603 cases with the given values of the Cobb measurements. In their study, the measurement variability was not considered. In our study, the measurement variability was reduced by using the computerized approaches in the Cobb measurement, the identification of the ALV and its pedicles, and the analysis of the relation between the CSVL and ALV. Although the proposed method still required user judgment to obtain the radiographic measurements (i.e., to determine the curves to be measured, to set rectangles on vertebrae and on pedicles of the ALV, and to identify two sacral landmarks), few operator skills were required. Using this computerized system, even an operator with less clinical experience was able to achieve excellent reliability (e.g., κ ≥ 0.81 by the student and 0.86 by the software developer). The classifications performed by five operators using the help of the computer were more consistent than those of two operators (surgeon and resident) without computer aid. Comparison of the three trials suggested that reliability could be improved when the operator's experience using this computerized tool improved. Because the only hardware requirement was a personal computer and the software environment was Visual C++, the developed software could be easily implemented in a clinic.
The system was based on the computerized Cobb measurement. In our previous study [22], the accuracy and reliability of the proposed Cobb measurement had been demonstrated (error or variability less than 3°) for the cases satisfying the criterion of Cobb angle < 90°. The shape constraints of the FHT algorithm are reasonable for most cases in clinics. If the Cobb angle of a curve was greater than 90°, false measurement might occur due to the shape constraints of the FHT algorithm. Therefore, any cases to be measured by this system should meet the criterion of Cobb angle < 90°. Actually, most cases satisfy this criterion in clinics. For the cases that do not satisfy the shape constraints, the user should manually measure the Cobb angle before the subsequent classification.

CONCLUSION
A computer-aided system for the Lenke classification is developed in this study. Different from the 3D computerized classification methods that use 3D parameters, the developed system automatically extracts accepted clinical parameters and performs the Lenke classification. Results demonstrate that the proposed computerized tool can assist in the Lenke classification. It reduced the technical errors in the Cobb angle measurement and the ALV identification and the human judgmental errors in all three components of the Lenke classification. The computer-aided method had reliability superior to that achieved without the computer aid. It can be used equally well by individuals with little clinical experience, although it still requires user intervention. This technique might be extended to assist other classification systems. Our future work will focus on computer-aided selection of fusion levels for scoliosis correction.