Gold Coast diagnostic criteria increase sensitivity in amyotrophic lateral sclerosis

OBJECTIVE
This study evaluates diagnostic accuracy of the proposed 'Gold Coast' (GC) diagnostic criteria for amyotrophic lateral sclerosis (ALS).


METHODS
Five European centres retrospectively sampled consecutive patients referred for electromyography on suspicion of ALS. Patients were classified according to the GC criteria, the revised El Escorial (rEE) criteria and the Awaji (AW) criteria without and with the 'Possible' category (+ Poss). Reference standard was ALS confirmed by disease progression at follow-up.


RESULTS
Of 404 eligible patients, 272 were diagnosed with ALS, 94 had mimicking disorders, 35 were lost to follow-up, and three had insufficient data. Sensitivity for the GC criteria was 88.2% (95% CI: 83.8-91.8%), which was higher than for previous criteria, of which the AW + Poss criteria reached the highest sensitivity of 77.6% (95% CI: 72.2-82.4%) (p < 0.001). Specificity was high for all criteria. The increase in sensitivity for the GC criteria was mainly due to the inclusion of 28 patients with progressive muscular atrophy (PMA).


CONCLUSIONS
The simpler GC criteria increase the sensitivity, primarily due to considering PMA as a form of ALS with high specificity preserved.


SIGNIFICANCE
This validation study supports that GC criteria should be used in clinical practice and may be used for inclusion in trials.


Introduction
New diagnostic criteria for amyotrophic lateral sclerosis (ALS), the Gold Coast (GC) criteria, were recently proposed by an international panel convened as a joint initiative by the International Federation of Clinical Neurophysiology, the World Federation of Neurology, the ALS Association, and the MND Association at Gold Coast, Queensland, Australia, in September 2019 (Shefner et al., 2020). The major motivation for developing new criteria was that previous criteria, the revised El Escorial criteria (rEE) (Brooks et al., 2000) with the Awaji (AW) amendment (de Carvalho et al., 2008), have limited sensitivity (Costa et al., 2012; Geevasinga et al., 2016b), are complex (Belsh, 2000; Johnsen et al., 2019), and have diagnostic categories that are poorly correlated with disease course (Ince et al., 2003; Traynor et al., 2000; Zoccolella et al., 2006).
The rEE and the AW criteria were intended to be used for inclusion in clinical trials, and traditionally only patients in the disease categories 'Definite' and 'Probable' have been included. More recently, it has been suggested to include patients in the 'Possible' disease category (Ludolph et al., 2015), augmenting sensitivity (Geevasinga et al., 2016b) and the number of eligible patients in clinical trials (Aggarwal et al., 2010; Cudkowicz et al., 2011). The GC criteria require only one region with concomitant lower motor neuron (LMN) and upper motor neuron (UMN) involvement for the ALS diagnosis, which is similar to the AW criteria including the 'Possible' category. In addition, there are some major changes in the GC criteria: 1) only two categories, 'ALS' or 'Not ALS', are considered; 2) patients with isolated LMN involvement in ≥ 2 regions, i.e. progressive muscular atrophy (PMA) (Kim et al., 2009), are considered as ALS; 3) patients with isolated UMN signs in two or more regions, which includes patients with primary lateral sclerosis (PLS), are not classified as ALS (Shefner et al., 2020). The rationale for including PMA patients as a form of ALS was that it is recognized that UMN signs are not always clinically evident (Shefner et al., 2020). This is supported by the progressive nature of PMA, by the fact that PMA is associated with the same risk factors for survival as ALS, and by the observation that many PMA patients develop UMN involvement as seen clinically or in post-mortem studies (Ince et al., 2003; Kim et al., 2009). The rationale for not classifying patients with isolated upper motor neuron signs in two or more body regions as ALS was that such patients ultimately may fulfil criteria for PLS, i.e. progressive UMN dysfunction in the absence of LMN signs for at least four years, and that these patients have a more protracted disease course and may never show LMN involvement (Shefner et al., 2020).
As was the case in previous criteria, findings from the clinical neurological examination may be sufficient to give the ALS diagnosis. Clinical evidence of LMN involvement is now, in contrast to what was the case in previous criteria, specified clearly as paresis and atrophy. Electromyographic (EMG) evidence of LMN involvement is, as in previous criteria, accepted as concomitant signs of ongoing denervation and signs of chronic denervation. However, in contrast to previous criteria, large motor unit potentials, defined as increased duration and/or increased amplitude, are clearly established as the only EMG marker of chronic denervation (Shefner et al., 2020). As defined in the AW criteria (de Carvalho et al., 2008), the GC criteria consider the presence of fasciculation potentials as equivalent to fibrillation potentials and positive sharp-waves as signs of ongoing denervation.
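The decision rule described above can be sketched in a few lines of Python. This is a minimal illustration, not the published criteria in full: the names `RegionFindings` and `gold_coast_class` are hypothetical, the `progressive` flag stands in for the requirement of progressive motor impairment, and the exclusion of alternative diagnoses by appropriate investigations is assumed to have been done beforehand.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class RegionFindings:
    """UMN/LMN involvement in one body region (hypothetical container)."""
    umn: bool = False
    lmn: bool = False


def gold_coast_class(findings, progressive=True):
    """Return 'ALS' or 'Not ALS' for a mapping of region name -> RegionFindings.

    Simplified reading of the rule as summarized in the text:
    - UMN and LMN involvement together in at least one region, or
    - LMN involvement in two or more regions (PMA),
    in a patient with progressive disease. Isolated UMN signs never qualify.
    """
    if not progressive:
        return "Not ALS"
    regions_with_both = sum(1 for f in findings.values() if f.umn and f.lmn)
    regions_with_lmn = sum(1 for f in findings.values() if f.lmn)
    if regions_with_both >= 1 or regions_with_lmn >= 2:
        return "ALS"
    return "Not ALS"


# Isolated LMN involvement in two regions (PMA) is classified as ALS.
pma = {"cervical": RegionFindings(lmn=True), "lumbosacral": RegionFindings(lmn=True)}
# Isolated UMN signs in two regions are not classified as ALS.
umn_only = {"bulbar": RegionFindings(umn=True), "cervical": RegionFindings(umn=True)}

print(gold_coast_class(pma))       # ALS
print(gold_coast_class(umn_only))  # Not ALS
```

Under the rEE and AW criteria with the 'Possible' category, the second example would instead count as a positive classification, which is the main point of divergence between the criteria sets.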
GC criteria represent expert opinion, and validation studies are required to prove their utility before they can be used in clinical practice and for inclusion in trials (Johnsen, 2020; Shefner et al., 2020). The aim of this study was to evaluate the diagnostic accuracy of the new criteria compared with previous criteria.

Methods
This multicentre study included five European centres and followed the Standards for Reporting of Diagnostic Accuracy (STARD) guidelines (Bossuyt et al., 2015). Each participating centre obtained permission to access and transfer patient data in anonymous format to a database according to the rules of the local Ethical Committee.

Data collection
Each centre retrospectively sampled consecutive series of adults referred for electrodiagnostic testing between 12-10-2011 and 23-07-2020 on suspicion of ALS. Inclusion criteria as specified in the protocol were: 1) Patients referred for electrodiagnostic testing on the suspicion of ALS; 2) Age 20-80 years; 3) Disease duration ≤ 36 months; 4) EMG including motor unit potential (MUP) analysis in at least two muscles innervated by different nerves and roots in each of the cervical and lumbosacral regions, and in one bulbar muscle (Brooks, 1994); 5) Nerve conduction studies in at least two motor and two sensory nerves; 6) Available clinical follow-up was sampled prospectively according to predefined criteria (see below). If more than one electrodiagnostic examination was performed in the inclusion period, the first one was chosen.
Study data were collected and managed using REDCap electronic data capture tools hosted at Aarhus University, Denmark (Harris et al., 2019; Harris et al., 2009) and included:
(1) Demographic data, onset region, duration from onset of symptoms until neurophysiological investigation, and clinical follow-up diagnosis.
(2) Standardized clinical information organized as presence or absence of UMN signs (tick boxes for yes/no, hyperreflexia, plantar response and other relevant pathogenic reflexes, spasticity) and LMN signs (tick boxes for yes/no, weakness, wasting, fasciculations, hyporeflexia) in each of the four body regions: bulbar, cervical, thoracic, and lumbosacral.
(3) Results from other investigations (MRI of brain and spine, blood tests, cerebrospinal fluid tests, neurofilament levels, genetic tests, etc.) obtained around the same time as the electrodiagnostic investigation.
(4) Data from the EMG examination including spontaneous activity (fibrillation potentials, positive sharp waves, and fasciculation potentials, each indicated as 0, +, ++, +++, ++++), recruitment pattern analysis including firing rate, and qualitative or quantitative MUP analysis (duration and amplitude indicated as normal, increased, decreased) for each examined muscle according to the routine procedure of the sampling laboratory.
(5) Data from nerve conduction studies.
(6) Diagnostic classification according to the GC, AW, and rEE criteria.

Data analysis
Index tests were classifications according to the GC, AW, and rEE criteria given by the laboratory performing the investigation. The diagnoses were subsequently checked by two authors (KP and BJ) and in case of disagreement the diagnosis was discussed with the laboratory performing the investigation and a consensus was reached.
Reference standard was independently determined by clinical follow-up as ALS diagnosis confirmed by death, need for assisted ventilation, and/or disease progression causing more severe functional status evaluated from patient records. Patients with other diagnoses and patients without progression for at least 12 months were classified as non-ALS.
Sensitivity and specificity were calculated for each criteria set. The rEE and the AW criteria were applied in two versions: one using the traditional approach including only the 'Definite' and the 'Probable' disease categories (and, for rEE, also the 'Probable laboratory-supported' category), and one more recently used version also comprising the 'Possible' category (+ Poss), which has been shown to increase sensitivity (Geevasinga et al., 2016b) and has been used in clinical trials. Analyses were done for the total population of ALS patients and for four subgroups of patients with bulbar-onset, spinal-onset, disease duration ≤ 12 months, and disease duration > 12 months before examination. Sample size was estimated at 291 patients to detect an increase in sensitivity from 70% to 80% or more using a confidence level of 95% and a statistical power of 80%. McNemar's or Fisher's exact test was applied to test for differences in sensitivity and specificity between the criteria. P-values < 0.05 were considered significant.
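The sample size figure can be reproduced with a standard formula, although the text does not state which formula the protocol used. The sketch below assumes a two-sided comparison of two independent proportions with unpooled variances; the function name `n_for_two_proportions` is hypothetical. Under these assumptions the result matches the 291 patients quoted above.

```python
import math
from statistics import NormalDist


def n_for_two_proportions(p1, p2, alpha=0.05, power=0.80):
    """Sample size to detect a change from p1 to p2.

    Assumed formula (not stated in the source):
        n = (z_{1-alpha/2} + z_{power})^2 * (p1*q1 + p2*q2) / (p2 - p1)^2
    """
    z_alpha = NormalDist().inv_cdf(1 - alpha / 2)  # about 1.96 for alpha = 0.05
    z_beta = NormalDist().inv_cdf(power)           # about 0.84 for 80% power
    variance_sum = p1 * (1 - p1) + p2 * (1 - p2)
    n = (z_alpha + z_beta) ** 2 * variance_sum / (p2 - p1) ** 2
    return math.ceil(n)


# Increase in sensitivity from 70% to 80%, 95% confidence, 80% power.
print(n_for_two_proportions(0.70, 0.80))  # 291
```

A paired design analysed with McNemar's test would in principle call for a different calculation based on the expected proportion of discordant pairs, so this should be read as one plausible reconstruction only.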

Results

Inclusion
A total of 404 patients were eligible. Of these, 35 were excluded because they were lost to follow-up and three were excluded because MUP analyses were not performed, leaving 366 patients for analysis (Fig. 1). The number of included patients from each centre ranged from 41 to 127 with a median of 71. Clinical follow-up confirmed ALS in 272 patients, while 94 patients were non-ALS (Fig. 1).

Demography
Demographic data are shown in Table 1. There were no differences between the ALS group and the non-ALS group with respect to age, sex-ratio, or disease duration, but onset region differed between the groups with more bulbar-onset and cervical-onset patients and fewer lumbosacral-onset patients in the ALS group. Diagnoses in the non-ALS group are shown in Table 2.
There were 38 patients with LMN signs without UMN signs and 28 of these had progressive involvement of two or more regions, i.e. PMA, and were classified as ALS according to the GC criteria, corresponding to 11.7% of the true positive patients. Without considering these 28 PMA patients as ALS patients, the sensitivity of the GC criteria would have been 78.3% (95% CI: 72.9-83.1%) with a specificity of 100% (95% CI: 96.2-100.0%).
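The confidence intervals quoted above are consistent with exact (Clopper-Pearson) binomial intervals, although the method is not named in the text, so that choice is an assumption here. The sketch below computes such an interval by bisection on the binomial distribution using only the standard library. The count of 213 true positives is inferred from the reported 78.3% of 272 patients and is not stated in the source; the helper names are hypothetical.

```python
import math


def binom_cdf(k, n, p):
    """P(X <= k) for X ~ Binomial(n, p), by direct summation.

    Adequate for sample sizes of a few hundred, as in this study.
    """
    if k < 0:
        return 0.0
    if k >= n:
        return 1.0
    total = sum(math.comb(n, i) * p**i * (1 - p) ** (n - i) for i in range(k + 1))
    return min(1.0, total)


def _solve_decreasing(f, target, tol=1e-10):
    """Find p in [0, 1] with f(p) = target, for f decreasing in p."""
    lo, hi = 0.0, 1.0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if f(mid) > target:
            lo = mid  # still above the target, so the root lies to the right
        else:
            hi = mid
    return (lo + hi) / 2


def clopper_pearson(k, n, alpha=0.05):
    """Exact two-sided confidence interval for k successes in n trials."""
    lower = 0.0 if k == 0 else _solve_decreasing(
        lambda p: binom_cdf(k - 1, n, p), 1 - alpha / 2)
    upper = 1.0 if k == n else _solve_decreasing(
        lambda p: binom_cdf(k, n, p), alpha / 2)
    return lower, upper


# Sensitivity without PMA: 213 of 272 (inferred count).
sens_lo, sens_hi = clopper_pearson(213, 272)
# Specificity without PMA: 94 of 94 non-ALS patients correctly negative.
spec_lo, spec_hi = clopper_pearson(94, 94)

print(f"sensitivity {213 / 272:.1%} ({sens_lo:.1%}-{sens_hi:.1%})")
print(f"specificity {94 / 94:.1%} ({spec_lo:.1%}-{spec_hi:.1%})")
```

For the 94 of 94 case the lower limit has the closed form 0.025^(1/94), about 96.2%, which agrees with the interval reported in the text.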

Subgroup analyses
In the subgroup of bulbar-onset patients, sensitivity for the GC criteria was 83.5% (95% CI: 73.5-90.9%). This did not differ from that of spinal-onset patients, which was 91.2% (95% CI: 85.9-95.0%) (Fisher's exact test, p = 0.17). The sensitivity of the GC criteria in bulbar patients was not higher than the sensitivity of the AW + Poss and rEE + Poss criteria (Table 3), which might be explained by the fact that there was only one PMA patient in this subgroup. In the subgroup of limb-onset patients, the sensitivity was higher for the GC criteria compared with the other criteria (Table 3). There was no difference in sensitivity between the subgroup with symptom duration ≤ 12 months and the subgroup with symptom duration > 12 months (Fisher's exact test, p = 0.69), and in both subgroups the GC criteria had higher sensitivity than the four other criteria (McNemar's test, p < 0.05).
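Because every patient was classified by all criteria sets, comparisons between criteria are paired, and McNemar's test uses only the discordant patients. The sketch below shows the exact (binomial) form of the test; whether the exact or the chi-square form was used is not stated in the text. The discordant counts in the example are hypothetical and chosen for illustration only, since the paired cross-tabulations are not given here.

```python
import math


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value.

    b: patients positive by criteria A but negative by criteria B
    c: patients negative by criteria A but positive by criteria B
    Under the null hypothesis, b follows Binomial(b + c, 0.5).
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    tail = sum(math.comb(n, i) for i in range(k + 1)) / 2**n
    return min(1.0, 2 * tail)


# Hypothetical discordant counts (not taken from the study data):
# 35 ALS patients detected only by criteria A, 6 detected only by criteria B.
p_value = mcnemar_exact(35, 6)
print(f"p = {p_value:.2g}")
```

Concordant patients, whether positive or negative by both criteria sets, do not enter the calculation, which is why the test can be significant even when overall sensitivities differ by only a few percentage points.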

False positive diagnoses
There was one false positive diagnosis using the GC criteria. This was a 65-year-old male with a 20-month history of progressing weakness of the lower limbs fulfilling clinical and EMG criteria for LMN involvement in the cervical and lumbosacral regions, thus given a diagnosis of PMA. Clinical UMN signs (hyperreflexia and Babinski) in the lower limbs were explained by a brain MRI showing chronic vascular changes. Because of slow progression a muscle biopsy was taken 9 months later showing inclusion body myositis.
Eight patients with isolated UMN signs in ≥ 2 regions were classified as 'Possible ALS' according to the rEE and the AW criteria. Two of these eight patients were false positives, as one fulfilled criteria for PLS at follow-up and one had diagnoses of Parkinson's disease and progressing dementia 19 months after the EMG examination. The remaining six patients fulfilled the reference standard and were classified as true positive according to the rEE + Poss and the AW + Poss criteria, and as false negative according to the GC criteria.

Discussion
This study shows that the GC criteria, as intended by the authors, are superior to the rEE criteria and to the AW criteria by a twofold increase in sensitivity. Compared with the rEE + Poss and the AW + Poss criteria that include the diagnostic category 'Possible', the GC criteria are also superior as they increase the sensitivity by about 10 percentage points, with high specificity preserved.

Comparisons with other studies
The sensitivity of the rEE + Poss and AW + Poss criteria found in our study is comparable with that of one other study (Geevasinga et al., 2016b) and slightly lower than that of a recent study (Hannaford et al., 2021). The sensitivities of the rEE and AW criteria without the 'Possible' category, as found in our study, are low compared to previous studies (Costa et al., 2012; Geevasinga et al., 2016a; Jang and Bae, 2015), but similar to the sensitivities of a more recent study (Hannaford et al., 2021). The lower sensitivity of our study compared to previous studies may be explained by the fact that we, in the application of the rEE and the AW criteria, adopted the requirement of both weakness and wasting for the clinical definition of LMN signs, as strictly defined in the GC criteria (Shefner et al., 2020). In the rEE and the AW criteria, the way to combine weakness, wasting, and fasciculations for detection of LMN involvement is not defined, and different ways of combining these features, or of using them in isolation, are a possible source of variation among studies. In order to make a balanced comparison of the criteria in this study, we found it reasonable to apply only the strict definition of the GC criteria.
Another study found a sensitivity of the GC criteria of 92% and also comparable sensitivities for the AW + Poss and the rEE + Poss criteria (Hannaford et al., 2021). However, the study found 18 false positives out of 506 patients, resulting in a rather low specificity of 88% for the GC criteria, and this was lower than the specificities for the AW + Poss and the rEE + Poss criteria. The relatively high number of false positive patients is not specified or explained but may be related to methodological issues, as index test and reference standard apparently were mixed up.

Choice of reference standard
In studies of diagnostic accuracy, it is required that the reference standard is independent of the index test. The GC criteria include patients with PMA, i.e., isolated LMN involvement in ≥ 2 regions, and the rEE and AW criteria include patients with isolated UMN involvement in ≥ 2 regions. Choosing a reference standard that requires 'classical' ALS findings of both LMN and UMN involvement will not allow for a positive diagnosis of such patients with only LMN or UMN involvement. Our choice of reference standard, defined as progression, assisted ventilation, or death, is independent of the index tests and ensures that criteria that include patients with only LMN or UMN involvement can be tested.

Specificity
For a fatal disease such as ALS, a high specificity of diagnostic criteria is required to avoid false positive diagnosis. For the GC criteria, it might have been expected that the inclusion of PMA as a form of ALS or the absence of recommendations for obligatory clinical investigations, e.g. MRI or cerebrospinal fluid analyses, to exclude mimicking diseases, could result in more false positive diagnoses but the data of the present study do not support this.

Progressive muscular atrophy and isolated upper motor neuron signs
The consideration of PMA as a form of ALS in the GC criteria provides the major contribution to their increase in sensitivity. The background for considering PMA as a form of ALS in the GC criteria was based on the findings that UMN signs often can be recognized later in the course of the disease or post-mortem (Ince et al., 2003; Kim et al., 2009) and that PMA and ALS share prognostic risk factors (Kim et al., 2009). There are, however, some differences in that PMA patients have higher mean age at onset, higher proportion of men, and longer mean survival time (Kim et al., 2009) and it may still be debated whether PMA should be regarded as a separate clinical entity or as one end of the ALS spectrum (Swinnen and Robberecht, 2014). The data of the present study support that PMA should be included in the diagnostic criteria as 27 of 28 patients classified as ALS (PMA) according to the GC criteria fulfilled the reference standard. It can, however, not be excluded that choosing a reference standard with longer follow-up time might have revealed that some patients had other diagnoses.
Interestingly, the inclusion of PMA in the diagnostic criteria weakens the need for objective methods for detection of UMN involvement in order to increase the sensitivity (de Carvalho et al., 2008).
In 11.7% of the true positive patients according to the GC criteria, UMN signs were not detected. This is a higher percentage of PMA patients than estimated in other studies (Kim et al., 2009; Swinnen and Robberecht, 2014), which may emphasize the importance of also detecting more subtle UMN signs, such as preserved reflexes in a wasted, weak limb or spread of reflexes.
The decision not to include patients with isolated UMN signs in ≥ 2 regions in the GC criteria, in contrast to the rEE and AW criteria, seems justified, as two of eight such patients were false positives, and specificity may have priority over sensitivity when diagnosing ALS.

Strengths and limitations
The design as a multicentre study with inclusion of consecutive patients referred to five centres is expected to cover the diversity in clinical practice, taking into consideration local differences, and thereby to reflect the everyday clinical settings in which the diagnostic criteria for ALS will be used. A limitation of this design is that an identical standardized examination program was not applied to all patients, as the examination programs to exclude other diseases were performed according to local practices. Testing the criteria in rare mimicking diseases such as Kennedy disease and multifocal motor neuropathy would require another study design with a different selection procedure. Our multicentre design with possible differences in referral patterns implies that a broader spectrum of patients was included, comprising patients with minor clinical suspicion of ALS in whom it was not possible to give a specific diagnosis at follow-up.
Only authors from the same department discussed disagreement, and involving experts from different centres would have been favourable.
The examiners were prompted first to apply the index test and then the reference standard by looking up follow-up information. It was, however, not possible to blind the examiners to the follow-up data, which is a possible source of bias.
The protocol excluded patients > 80 years and patients with a disease duration of > 36 months, and the results cannot necessarily be transferred to very old patients and patients with unusually long disease duration. This may for example apply to PLS patients, who generally have longer disease duration.
A study of diagnostic methods might include analyses of interrater variability. This requires that multiple physicians give diagnoses on the same set of patient cases, which was not within the scope of this study. The simplicity of the GC criteria, as compared with the rather complex rEE and AW criteria, may, however, suggest that inter-rater variability is lower for the GC criteria.
Patients with a diagnosis other than ALS or without progression for 12 months were classified as non-ALS at follow-up. It cannot be excluded that some atypical ALS patients with very slow progression, for example with flail limb syndrome, were misclassified, resulting in an underestimation of the sensitivity. Except for two patients who were followed for 14 and 15 months, all non-ALS patients were followed for 24 months or more. Possible misclassifications of such atypical patients would lower the sensitivities but would not affect the high specificities of the criteria.

Impact
The GC criteria are intended to be used both clinically and for inclusion in trials, in contrast to the rEE and the AW criteria that were intended to be used only for inclusion in trials. The present study validates the use of the GC criteria in clinical practice, which may additionally improve medical management of PMA patients. The present study also supports the use of the GC criteria for inclusion in trials, which might give the opportunity to start disease-modifying therapies earlier in patients with isolated LMN signs but not in patients with isolated UMN signs. If more homogeneous patient groups are desired in trials, additional criteria, for example exclusion of PMA patients, may be used (Shefner et al., 2020).

Conclusions
This study provides evidence that the proposed GC diagnostic criteria, mainly due to the consideration of PMA as a form of ALS, increase the sensitivity without reducing the specificity as compared with the rEE and AW criteria in adults ≤ 80 years with a disease duration ≤ 36 months. The results suggest that the GC criteria should be used as diagnostic criteria for ALS in clinical practice and may be used for inclusion in trials.

Author contributions
Kirsten Pugdahl: Design and conceptualized study; major role in the acquisition of data; analysed the data; drafted the manuscript for intellectual content.
Jean-Philippe Camdessanché: Design and conceptualized study; major role in the acquisition of data, revised the manuscript for intellectual content.
Bülent Cengiz: Design and conceptualized study; major role in the acquisition of data, revised the manuscript for intellectual content.
Mamede de Carvalho: Design and conceptualized study; revised the manuscript for intellectual content.
Rocco Liguori: Design and conceptualized study; revised the manuscript for intellectual content.
Celine Rossatto: Major role in the acquisition of data, revised the manuscript for intellectual content.
Miguel Santos: Major role in the acquisition of data; revised the manuscript for intellectual content.
Veria Vacchiano: Major role in the acquisition of data; revised the manuscript for intellectual content.
Birger Johnsen: Design and conceptualized study; major role in the acquisition of data; analysed the data; drafted the manuscript for intellectual content.

Declarations of interest
Mamede de Carvalho is a co-author of the paper suggesting the GC criteria (Shefner et al., 2020).

Declaration of Competing Interest
The authors declare that they have no known competing financial interests or personal relationships that could have appeared to influence the work reported in this paper.