Introduction

The International Standards for Neurological Classification of Spinal Cord Injury (ISNCSCI),1 published by the American Spinal Injury Association (ASIA), are an established and well-investigated2 assessment for quantifying the neurological deficits resulting from a human spinal cord injury (SCI). Over the last decade the ISNCSCI has become the de facto standard for classification of spinal cord-injured subjects. This is reflected by several important recommendations for clinical trials: (1) SCI-focused journals recommend the ISNCSCI and especially the ASIA Impairment Scale (AIS) in their author guidelines; (2) the International Campaign for Cures of Spinal Cord Injury Paralysis (ICCP) recommends ISNCSCI as an inclusion/exclusion criterion, for stratification and subgrouping, and as an outcome measure.3, 4, 5, 6

ISNCSCI requires two different types of skills, namely testing and scoring skills.7 Reliable sensory and motor testing is achieved through practice in a heterogeneous patient population, preferably consisting of patients with incomplete spinal cord lesions (for data on the reliability of this part see Marino et al.8 and Savic et al.9).

Several publications10, 11 have called for ISNCSCI raters to be expertly trained. However, little evidence is available on the outcome of formal training and on the most frequent pitfalls in a larger cohort.

Thus, the aim of this investigation was to (1) quantify the effect of formal training on classification accuracy on the basis of ISNCSCI’s motor and sensory examination and (2) identify the most difficult and error-prone rules and definitions within the ISNCSCI framework.

This work has been performed as a part of the quality management system of the European Multicenter Study on Human Spinal Cord Injury (EMSCI—http://emsci.org)12 and was one of the preconditions for its successful certification (ISO 9001:2008).

Materials and methods

Formal ISNCSCI training

The ISNCSCI instructional courses within the EMSCI network are conducted twice a year (preferably one in English, one in German). The training is performed by experienced ISNCSCI examiners and raters. Teaching is based on the 2003 ISNCSCI reference manual including the latest additions and clarifications from ASIA13, 14 and the 2011 revised and reprinted ISNCSCI booklet.1 Both examination and scoring/classification skills are taught during the two days of training. The first day (5 h) is focused on providing the participants with the theoretical background of the examination and performing examinations on healthy subjects. The second day (7 h) is split into a practical session of examining patients and a scaling, scoring and classification part. The latter part lasts 3 h and comprises a theoretical session with a formal presentation, the classification of the previously examined patients with participation of all attendees, and a group discussion. In this discussion, the most challenging cases in terms of proper classification are reviewed on the basis of Kirshblum’s15 collection of difficult cases. The following topics are particularly emphasized based on the consensus of the experienced raters:

  • Not assessable myotomes: C2−C4, T2−L1, S2−S5 and the corresponding rules to predict motor function from the sensory assessments, including the implication for the motor level (ML) determination.

  • A myotome is considered intact if it is graded 3 or 4 and the next rostral myotome is graded as intact, including the implication for the ML determination.

  • The mandatory presence of sacral sparing as the definition of an incomplete lesion (decision AIS A vs AIS B or C or D).

  • Sparing of motor function below the ML as the decisive parameter for determination of a motor incomplete lesion (AIS B vs AIS C or D).

  • The number of key muscles graded 3 or better below the neurological level of injury as the differentiation criterion between AIS C and AIS D.
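The last three topics form a decision chain that can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the official ISNCSCI algorithm and not the EMSCI implementation: the `Exam` record, its field names and the function `ais_grade` are hypothetical, the motor level and neurological level are assumed to be already known, AIS E and not testable values are ignored, and at least one key muscle is assumed to lie below the neurological level.

```python
from dataclasses import dataclass
from typing import List


@dataclass
class Exam:
    """Hypothetical, already pre-digested examination summary."""
    sensory_sacral_sparing: bool            # any sensation preserved in S4-S5
    voluntary_anal_contraction: bool        # motor function in S4-S5
    motor_levels_below_ml: int              # segments below the ML with motor function (better side)
    key_muscle_grades_below_nli: List[int]  # grades 0..5 of all key muscles below the NLI


def ais_grade(exam: Exam) -> str:
    """Simplified AIS A-D decision chain (AIS E is not handled)."""
    # Step 1: without sacral sparing the lesion is complete (AIS A).
    sacral_sparing = exam.sensory_sacral_sparing or exam.voluntary_anal_contraction
    if not sacral_sparing:
        return "A"

    # Step 2: motor incomplete if there is voluntary anal contraction or motor
    # function more than three levels (i.e. at least four) below the ML.
    motor_incomplete = (exam.voluntary_anal_contraction
                        or exam.motor_levels_below_ml > 3)
    if not motor_incomplete:
        return "B"

    # Step 3: AIS D if at least half of the key muscles below the NLI
    # are graded 3 or better, otherwise AIS C.
    grades = exam.key_muscle_grades_below_nli
    useful = sum(1 for grade in grades if grade >= 3)
    return "D" if 2 * useful >= len(grades) else "C"


# Hypothetical data: six key muscles below the NLI, exactly three graded >= 3.
print(ais_grade(Exam(True, True, 0, [5, 5, 4, 2, 1, 0])))  # D
# Hypothetical data: sparing of exactly three levels does not suffice.
print(ais_grade(Exam(True, False, 3, [1, 1, 0, 0])))        # B
```

The sketch makes the two counting criteria explicit: step 2 counts segments below the ML, whereas step 3 counts key muscles below the neurological level of injury.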

Pre and post cases

The ISNCSCI examinations of five patients (Table 1) from the EMSCI database serve as test cases; these differ from the training cases. All cases include the full data of the sensory, motor and anorectal examinations printed on the ‘2000 rev.’ revision of the ISNCSCI assessment sheet and have been carefully selected to reflect difficult aspects of scaling, scoring and classification. Case 1 represents a cervical complete lesion in which light touch sensation is already bilaterally impaired in segment C3, but with preservation of motor function more than three levels below the ML. Applying the ‘ML follows sensory level’ rule for not assessable myotomes, left and right ML are both C2 in this particular case. Case 2 is a patient with a motor incomplete lesion without voluntary anal contraction, but with myotome C5 graded 4 on both sides. Motor function is preserved down to T1, that is, exactly four segments below the ML. Case 3 outlines a motor incomplete central cord syndrome with unusual preservation of pinprick sensation in all dermatomes. Case 4 is a borderline case between a sensory incomplete and a motor incomplete lesion. MLs in the thoracic region are determined by applying the ‘ML follows sensory level’ rule. In this patient motor function is preserved exactly three segments below the ML. Case 5 delineates a low lumbar lesion. Even though the sensory function on the right side is normal, the motor examination reveals a grade 4 in myotome S1. All rostral myotomes on this side are graded as normal. The neurological level of injury is segment L3; voluntary anal sphincter contraction is present. There are six assessable myotomes below L3, exactly three of which have a muscle grade equal to or greater than 3.

Table 1 Overview of pre-/post-test cases

Data handling and evaluation

All participants were asked to rate and classify the five testing cases before (pre-test) and the same cases after the instructional course (post-test). Classification variables include right and left ML, right and left sensory level, as well as severity of injury: complete or incomplete, the AIS and the zones of partial preservation (sensory and motor for left and right side). This results in 50 questions (5 cases × 10 variables) per rater and test. Neither the testing cases nor the pre-test results were reviewed during the course. Participants were instructed to work on their own. Instructors prevented teamwork.

Difficulty levels of these variables were determined by counting the proportion of correct answers. The most difficult variables were further analyzed. Every error was classified (by the authors CS and RR) into predefined error classes.
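As an illustration of how such a difficulty level can be computed, the following minimal sketch counts the proportion of correct answers per classification variable. The data layout (dictionaries keyed by rater, case and variable) and the function name `proportion_correct` are assumptions made for this example; they do not describe the structure of the study database.

```python
from collections import defaultdict


def proportion_correct(answers, answer_key):
    """Return the share of correct answers for each classification variable.

    answers:    {(rater, case, variable): given value}
    answer_key: {(case, variable): correct value}
    """
    correct = defaultdict(int)
    total = defaultdict(int)
    for (rater, case, variable), value in answers.items():
        total[variable] += 1
        if value == answer_key[(case, variable)]:
            correct[variable] += 1
    return {variable: correct[variable] / total[variable] for variable in total}


# Hypothetical mini data set: two raters, one case, two variables.
answer_key = {(1, "ML right"): "C2", (1, "AIS"): "A"}
answers = {
    ("rater 1", 1, "ML right"): "C2",
    ("rater 2", 1, "ML right"): "C5",
    ("rater 1", 1, "AIS"): "A",
    ("rater 2", 1, "AIS"): "A",
}
print(proportion_correct(answers, answer_key))
# {'ML right': 0.5, 'AIS': 1.0}
```

A lower proportion marks a more difficult variable; in this hypothetical data set the right ML would be the variable selected for further error analysis.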

In addition to the pre-test, participants were asked to complete a questionnaire about their occupation (physician, physical therapist, occupational therapist, other rehabilitation professional), their self-rated experience in ISNCSCI (novice, experienced, highly experienced, expert), their self-rated experience in SCI medicine (<1 year, 1–5 years, 6–10 years, >10 years) and frequency of conducting ISNCSCI examinations (none, once a month, once a week, twice a week, once a day).

To analyze the influence of these factors in an analysis of variance, the subgroups first had to be pooled so that each contained a substantial number of cases. The resulting pooled subgroups are listed in Supplementary Table 1.

Data were stored in a custom Access 2003 (Microsoft Corp., Redmond, WA, USA) database. Statistics were performed with Statistica 7.1 (StatSoft Inc., Tulsa, OK, USA) using repeated-measures analysis of variance. Spearman’s rank correlation coefficients were used for testing correlation hypotheses.
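For readers unfamiliar with the statistic, Spearman’s rank correlation coefficient is the Pearson correlation of the ranks of two variables, with tied values receiving their average rank. The sketch below is a plain-Python illustration of that definition only; it is not the Statistica procedure used in the study, and the function names and example data are hypothetical.

```python
def average_ranks(values):
    """Rank values from 1..n; tied values share the mean of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j to the end of the current run of tied values.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        mean_rank = (i + j) / 2 + 1  # positions are 0-based, ranks 1-based
        for k in range(i, j + 1):
            ranks[order[k]] = mean_rank
        i = j + 1
    return ranks


def spearman_rho(x, y):
    """Pearson correlation of the rank-transformed samples x and y."""
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(rx)
    mean_x, mean_y = sum(rx) / n, sum(ry) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(rx, ry))
    var_x = sum((a - mean_x) ** 2 for a in rx)
    var_y = sum((b - mean_y) ** 2 for b in ry)
    return cov / (var_x * var_y) ** 0.5


# Hypothetical data: self-rated experience coded 1 (novice) to 4 (expert)
# against the number of correct answers out of 50.
experience = [1, 1, 2, 3, 4]
correct_answers = [20, 25, 24, 31, 40]
print(spearman_rho(experience, correct_answers))
```

Because only ranks enter the calculation, the ordinal experience categories need no assumption about the distance between, say, ‘novice’ and ‘experienced’.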

Results

A total of 106 persons attended the workshop in 10 instructional courses since 2006. Three courses were held in English and seven in German. The participants’ occupation, experience in SCI/ISNCSCI and frequency of conducting ISNCSCI examinations are shown in Figure 1. The majority of the attendees were physicians (59.4%). Almost one half of them had less than 1 year of experience in SCI (48.1%). Of these, 57.6% were novices to the ISNCSCI. About one-third of the participants performed one ISNCSCI examination per week (31.1%).

Figure 1

Basic characteristics of the participants in the ISNCSCI instructional courses.

In the pre-test 2628 out of 5300 questions (49.6%) were answered correctly. After the instructional course every participant improved in post-testing. The number of correct answers increased significantly (P<0.00001) to 4849 out of 5300 (91.5%). Twelve participants (11.3%) answered all questions correctly. The percentage of correct answers split into classification variables is shown in Figure 2. In post-testing the accuracy is highest for rating sensory levels (96.8%) and completeness (96.2%), whereas MLs (81.9%) and the AIS (88.1%) are more difficult to determine correctly.
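The reported totals and percentages follow directly from the study design (106 participants, 5 cases, 10 variables per case). The short check below recomputes them; the variable names and the helper `percent` are chosen for this example only.

```python
participants = 106
cases = 5
variables_per_case = 10

# Every participant answers every variable of every case once per test.
questions = participants * cases * variables_per_case


def percent(part, whole):
    """Percentage rounded to one decimal place."""
    return round(100 * part / whole, 1)


pre_correct, post_correct = 2628, 4849
all_correct_raters = 12

print(questions)                                  # 5300
print(percent(pre_correct, questions))            # 49.6
print(percent(post_correct, questions))           # 91.5
print(percent(all_correct_raters, participants))  # 11.3
```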

Figure 2

Percentages of correct answers of pre- and post-test split into single ISNCSCI variables: AIS, completeness, ML, motor zones of partial preservation, sensory levels and sensory zones of partial preservation.

The ML error classification (Table 2) reveals that the vast majority of errors (69.7%) relate to the ‘motor follows sensory’ rule in clinically not assessable myotomes (C1−C4, T2−L1 and S2−S5), where the ML is presumed to be the same as the sensory level. The tests contain two cases where this rule is applicable: (1) in the high cervical region, which accounts for 98.5% of this error class, and (2) in the thoracic region, which accounts for only 1.5%.

Table 2 Classification of errors in motor level determination

The error analysis of the AIS classification is listed in Table 3. The motor incompleteness criterion accounts for 58.7% of the errors. The tests contain two cases with a borderline decision between AIS B and AIS C or D: (1) case 2 has a sparing of motor function four levels below the ML, which accounts for 48.6% of this error class; (2) case 4 has a sparing of exactly three levels (51.4%).

Table 3 Classification of errors in ASIA impairment scale determination

The sacral sparing as criterion for an incomplete lesion is another source of errors (25.4%). In this error pattern an AIS C lesion is misclassified as AIS A although impaired light touch sensation is present in S4−S5.

The results of the analysis of variance reveal that none of the pooled (Supplementary Table 1) items (Figure 1) ‘occupation’, self-rated ‘experience in SCI’ and ‘frequency of conducting ISNCSCI examinations’ had a significant influence on the ISNCSCI classification performance (Supplementary Tables 2 and 3).

Two secondary analyses of variance were performed to analyze the influence of the factors ‘training language’ and ‘teacher scaling, scoring and classification’ on the post-test results. No significant differences were found (Supplementary Table 2).

The self-rated experience in ISNCSCI is significantly correlated with the pre-test results (Spearman’s ρ=0.2593, P<0.05, Figure 3) but not with the post-test results.

Figure 3

Correlation between number of correct answers in pre-testing and post-testing and self-rated ISNCSCI experience.

Discussion

The aim of the presented work was to assess the efficacy of formal instructional courses on scoring, scaling and classification according to the ISNCSCI. All participants improved their classification skills in each variable (sensory levels, MLs, completeness, AIS, motor/sensory zones of partial preservation), which in principle provides evidence for the success of the instructional courses. The overall success rate for correct answers after training (91.5%) is in line with other publications investigating the ISNCSCI classification properties. Chafetz et al.16 report a classification rate of 90% using a similar pre-/post-test approach (28 attendees, 10 patients). Cohen et al.7 (106 attendees, 2 patients) find good post-test results for an AIS A case (71% correct answers with respect to ML, 100% with respect to completeness) but poor post-test results for the AIS D case (21% correct answers with respect to ML, 97% with respect to completeness).

In addition to previous assessments of instructional course efficacy, a more in-depth error analysis was performed in the present study. Every error in ML and AIS determination was classified to gain insight into the underlying mechanisms. This revealed two problematic decisions: (1) The ‘motor follows sensory’ rule is well accepted in the thoracic region, but not in the high cervical region. It seems to be counterintuitive to look first at the sensory scores of C2−C4 to determine a ML. (2) The definition of a motor incomplete lesion relies on the ML, itself a difficult variable, as the reference level for determining sparing of motor function. The exact underlying mechanisms cannot be determined quantitatively with the current set of cases. However, the similar wording of the motor incompleteness definition (‘sparing of motor function more than three levels below the ML’) and the AIS C vs AIS D definition (‘at least half of the key muscles below the neurological level must have a muscle grade equal to or greater than three’) might explain the high error rates. Both the choice of the appropriate reference level (ML vs neurological level) and the counting criterion involving the word ‘three’ seem to be confusing. Apparently, the latter causes more confusion and could be avoided by rephrasing the motor incompleteness definition to ‘at least four levels below the ML’. Two of the five testing cases contained borderline decisions between AIS B and AIS C/D. This difficulty might explain the error rate of 12% in determination of the AIS.

The large number of untrained participants in the instructional courses most likely explains the low average pre-test performance (49.6%). The selection of difficult cases prevented an ideal performance with 100% correct answers in post-testing and explains why only 12 participants (11.3%) answered all questions correctly. Another 37 attendees (34.9%) misclassified only the ML of case 1. Hence, an excellent success rate of 46.2% would have been achieved by excluding this extremely difficult case from the evaluation.
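The 46.2% figure follows from adding the 12 participants without any error to the 37 attendees whose only error was the ML of case 1, relative to all 106 participants:

```latex
\[
\frac{12 + 37}{106} = \frac{49}{106} \approx 0.462 = 46.2\%
\]
```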

However, it was intended to challenge even experienced raters in order to avoid ceiling effects in post-testing. Of course, a more representative selection of SCI cases would have improved the classification performance.

‘Experience in ISNCSCI’ represents the only factor that significantly influenced the pre-test performance. After training the classification performances are no longer affected by the self-rated experience in ISNCSCI, which supports the efficacy of ISNCSCI training courses.

In line with previous findings,16 the profession did not affect the classification performance. In other words, professionals from different medical fields, for example, physical therapists, research associates and medical doctors, are equally capable of acquiring the respective skills.

Despite different language backgrounds within the European-wide EMSCI consortium, teaching of international trainees in English did not yield a post-test performance any different from courses given in German for participants from German-speaking countries.

The presented findings need to be considered in clinical trials for properly calculating sample sizes and estimating the statistical power. ISNCSCI training is an effective method to increase examination reliability.17 As shown in this work, training drastically increases classification accuracy from 49.6 to 91.5%.

ISNCSCI training in general should be based on the latest ASIA references.1, 13, 14 These publications supersede the 2003 ISNCSCI reference manual and the 2000 booklet, which should no longer be used for referencing ISNCSCI.

The two most error-prone rules in ISNCSCI include the inherent ISNCSCI problem of not clinically testable myotomes C2−C4, T2−L1 and S2−S5. We strongly suggest reiterating this definition and highlighting its high relevance for appropriate classification in ISNCSCI training courses and updated versions of the reference manual.

A recent publication14 on ISNCSCI clarified the ML determination for the ‘transition zones’ C4−C5 and L1−L2. Assuming that the C5 or L2 myotome is graded as 3 or 4, the ML determination depends on the status of sensory testing at C4 or L1. The ASIA introduced algorithms to predict the motor function from the corresponding light touch and pin prick examinations for the segments C4 and L1.14

Recently we proposed a similar but more generalized approach.18 We suggested calculating virtual motor scores (predicted from light touch and pin prick) not only for C4 and L1, but for all clinically not assessable segments, so that sensory scores act as proxies for motor scores in these segments. Although this ‘proxy’ method was primarily developed for computational ISNCSCI scaling, scoring and classification, it is also offered in the instructional courses as an additional method for determining MLs. Once all virtual muscle grades are calculated, MLs can be determined as easily as sensory levels.
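The idea can be sketched for one body side as follows. This is a simplified illustration and not the published EMSCI algorithm: the mapping used here (virtual grade 5 if both light touch and pin prick are normal, that is scored 2, otherwise 0) is an assumption made for the example, as are all function and variable names; not testable values are not handled.

```python
# Segments of one body side, from rostral to caudal.
SEGMENTS = (["C%d" % i for i in range(2, 9)]
            + ["T%d" % i for i in range(1, 13)]
            + ["L%d" % i for i in range(1, 6)]
            + ["S1", "S2", "S3", "S4-5"])
KEY_MYOTOMES = {"C5", "C6", "C7", "C8", "T1", "L2", "L3", "L4", "L5", "S1"}


def virtual_grades(motor, light_touch, pin_prick):
    """Muscle grades for all segments; sensory scores act as motor proxies."""
    grades = []
    for segment in SEGMENTS:
        if segment in KEY_MYOTOMES:
            grades.append(motor[segment])  # clinically examined grade 0..5
        else:
            # Assumed proxy rule: intact sensation -> virtual grade 5, else 0.
            intact = light_touch[segment] == 2 and pin_prick[segment] == 2
            grades.append(5 if intact else 0)
    return grades


def motor_level(grades):
    """Most caudal segment graded >= 3 with all rostral segments graded 5.

    Returns None if already the most rostral segment (C2) is not intact.
    """
    level = None
    for segment, grade in zip(SEGMENTS, grades):
        if grade == 5:
            level = segment
            continue
        if grade >= 3:
            level = segment
        break
    return level


# Hypothetical side with light touch impaired from C3 downwards in the
# not assessable segments and full strength in all key muscles.
light_touch = {s: 2 for s in SEGMENTS}
pin_prick = {s: 2 for s in SEGMENTS}
light_touch["C3"] = 1
motor = {s: 5 for s in KEY_MYOTOMES}
print(motor_level(virtual_grades(motor, light_touch, pin_prick)))  # C2
```

With the virtual grades in place, a single rostral-to-caudal pass yields the ML, which mirrors how sensory levels are determined. In the hypothetical example the impaired light touch at C3 stops the pass at C2, regardless of the normal key muscle grades further caudally.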

Computational instead of manual ISNCSCI classification18, 19, 20 helps to avoid the human error source. For this reason the EMSCI project provides a freely available web application for online ISNCSCI classification at http://ais.emsci.org as a training tool. In addition, the EMSCI ISNCSCI algorithm is capable of handling datasets containing not testable segments.18 Not testable segments are clinically relevant: about 9% of all datasets in the EMSCI database contain not testable segments.18 However, official rules for handling not testable segments are missing in the most recent version of the ISNCSCI.

Nevertheless, a profound knowledge of the classification framework is required to avoid complete dependence on computational ISNCSCI classification and to draw proper conclusions when interpreting the respective data. One example is the clinically meaningful interpretation of AIS A, which is defined by the absence of sensory and motor function in segment S4−S5 and not by a complete transection of the spinal cord. Misclassifications such as classifying a paraplegic as a tetraplegic because of a concomitant peripheral nerve injury can be avoided by skilled raters, who can interpret examination values properly.

Conclusion

Overall, this study illustrates that ISNCSCI instructional courses are mandatory in order to maintain the highest possible level of examination, classification and interpretation skills, both in novices and experts. The clinically not assessable myotomes were identified as the most prominent sources for errors in the ISNCSCI classification process.

Data archiving

There were no data to deposit.