Cortical involvement determines impairment 30 years after a clinically isolated syndrome

Abstract
Many studies report an overlap of MRI and clinical findings between patients with relapsing-remitting multiple sclerosis (RRMS) and secondary progressive multiple sclerosis (SPMS), which in part reflects the inclusion of subjects with variable disease duration and short periods of follow-up. To overcome these limitations, we examined the differences between RRMS and SPMS and the relationship between MRI measures and clinical outcomes 30 years after first presentation with a clinically isolated syndrome suggestive of multiple sclerosis. Sixty-three patients were studied 30 years after their initial presentation with a clinically isolated syndrome; only 14% received a disease modifying treatment at any time point. Twenty-seven patients developed RRMS, 15 SPMS and 21 experienced no further neurological events; these groups were comparable in terms of age and disease duration. Clinical assessment included the Expanded Disability Status Scale, 9-Hole Peg Test, Timed 25-Foot Walk and the Brief International Cognitive Assessment For Multiple Sclerosis. All subjects underwent a comprehensive MRI protocol at 3 T measuring brain white and grey matter (lesions, volumes and magnetization transfer ratio) and cervical cord involvement. Linear regression models were used to estimate age- and gender-adjusted group differences between clinical phenotypes after 30 years, and stepwise selection to determine associations between a large set of MRI predictor variables and physical and cognitive outcome measures. At the 30-year follow-up, the greatest differences in MRI measures between SPMS and RRMS were the number of cortical lesions, which was higher in SPMS (the presence of cortical lesions had 100% sensitivity and 88% specificity), and grey matter volume, which was lower in SPMS. Across all subjects, cortical lesions, grey matter volume and cervical cord volume explained 60% of the variance of the Expanded Disability Status Scale; cortical lesions alone explained 43%. Grey matter volume, cortical lesions and gender explained 43% of the variance of Timed 25-Foot Walk. Reduced cortical magnetization transfer ratios emerged as the only significant explanatory variable for the Symbol Digit Modalities Test and explained 52% of its variance. Cortical involvement, both in terms of lesions and atrophy, appears to be the main correlate of progressive disease and disability in a cohort of individuals with very long follow-up and homogeneous disease duration, indicating that this should be the target of therapeutic interventions.


Introduction
Clinical outcomes in multiple sclerosis are highly variable and develop over decades, and to date no biomarker has been proven to robustly explain levels of disability, or to reliably distinguish between relapsing-remitting (RRMS) and secondary progressive (SPMS) disease. In part this is likely due to previous studies assessing patients who were at different points in their disease course or were studied only for a short period of follow-up. Studies comparing patients with disease durations of less than two decades are likely to be hampered by a proportion of patients in the RRMS group who will eventually go on to develop SPMS, 1 and who therefore may already display biomarker features of progressive disease. In cross-sectional studies comparing RRMS and SPMS, disease duration is usually not matched, thereby making it difficult to determine whether a given biomarker is associated with disease subtype independently of disease duration. 2 We recently completed a longitudinal prospective 30-year follow-up of patients recruited soon after a clinically isolated syndrome (CIS) suggestive of multiple sclerosis, to determine the long-term clinical outcomes and their relationship with brain lesions within the first 5 years. 3 In the 120 patients with known outcome, around a third remained classified as having had a CIS, a third developed SPMS or died because of multiple sclerosis, while the rest had RRMS; of the RRMS group, ~90% of patients were able to walk without major limitations. 3 Of those who developed SPMS, nearly all converted within 20 years after symptom onset.
In the present study, we aimed to determine the MRI correlates of progressive disease and disability after 30 years. In this cohort, RRMS and SPMS are well matched for disease duration and, given the 30-year follow-up, the RRMS subgroup is likely to be phenotypically relatively 'pure', containing few patients who have yet to develop SPMS. Numerous brain and spinal cord MRI metrics have been proposed to explain disability levels in multiple sclerosis, 4 and measures of cortical 5 and spinal cord involvement 6 were most consistently found among the correlates of present and predictors of future disability. Limitations of those previous investigations include the analysis of individual factors in isolation, both neurologically and radiologically, as well as a limited duration of follow-up.
We thus set out to use multi-parametric MRI in a CIS cohort studied 30 years after presentation to determine to what extent MRI measures, including white matter lesions, cortical lesions, grey matter volumes, magnetization transfer ratios (MTRs) and cervical spinal cord volumes, distinguish RRMS from SPMS and explain a broad variety of neurological and cognitive outcomes.

Study cohort
This study is based on the analysis of 30-year follow-up clinical and MRI data of a cohort of patients with a CIS, which has been described before. 3 Briefly, 132 patients with a CIS suggestive of multiple sclerosis were prospectively recruited between 1984 and 1987 at the National Hospital for Neurology and Neurosurgery and Moorfields Eye Hospital, and were followed over 30 years. At 30 years, clinical outcome data were obtained in 120 participants, 3 of whom 63 had MRI data and are the subject of this report. The 2010 McDonald MS diagnostic criteria were used. 7 Twenty-seven patients developed RRMS, 15 SPMS and 21 experienced no further neurological events. As recruitment predated the disease modifying treatment era, the cohort was largely untreated. Only nine (14%) had a disease modifying treatment at any point, all of which were first-line injectable drugs, with the earliest beginning 10 years after multiple sclerosis diagnosis (when disease modifying treatments first became available in the UK). Of these, five had SPMS at 30 years, and four had RRMS. 3

Clinical assessment
Disability was assessed using the Expanded Disability Status Scale (EDSS), the Timed 25-Foot Walk (T25FWT) and the 9-Hole Peg Test (9HPT) of the non-dominant hand. Cognitive outcome scores included the paced auditory serial addition test (PASAT), which is a subtest of the Multiple Sclerosis Functional Composite Score, 8 as well as the Brief International Cognitive Assessment For Multiple Sclerosis (BICAMS) 9,10 with its three components: the Revised Brief Visuospatial Memory Test (BVMTR), the Symbol Digit Modalities Test (SDMT) and the California Verbal Learning Test (CVLT). BICAMS z-scores (adjusted for age, sex and years of education according to the population data from the BICAMS consortium 10 ) were obtained in 60 patients, thus excluding in total five CIS, six RRMS and seven SPMS cases older than 65 years from this subanalysis.
This study was approved by our institutional ethics committee and the National Research Ethics Service (15/LO/0650). Participants gave informed written consent.

Image acquisition
MRI was performed using a 3 T Achieva system (Philips Healthcare) and a 32-channel receive head coil.
The scan protocol included a 3D fluid attenuated inversion recovery (FLAIR) sequence acquired in the sagittal plane with repetition/inversion time: 4800/1650 ms, echo time: 297 ms and a voxel size of 1 × 1 × 1 mm³. T2-weighted axial scans were acquired with repetition time: 4375 ms, echo time: 85 ms and a voxel size of 0.5 × 0.5 × 3 mm³. 3D T1-weighted images were acquired sagittally using a fast field echo (FFE) sequence with repetition/echo time: 7.1/3.2 ms, inversion time: 848 ms, flip angle (α) = 8° and a voxel size of 1 × 1 × 1 mm³, also covering the cervical spinal cord. Magnetization transfer ratio (MTR) was calculated based on a three-dimensional slab-selective FFE sequence with two echoes (repetition time: 6.5 ms, echo time 1/echo time 2: 2.8/4.4 ms, α = 9°), acquired with and without a sinc-Gaussian shaped magnetization transfer prepulse of nominal α = 360°, offset frequency 1 kHz and duration of 16 ms, with a voxel size of 1 × 1 × 1 mm³. We used a turbo field echo (TFE) readout (echo train length of four, TFE shot interval 32.5 ms), with a total time between successive magnetization transfer pulses of 50 ms. Phase-sensitive inversion recovery (PSIR) was acquired axially with a repetition/inversion time of 7302/400 ms, an echo time of 13 ms and a voxel size of 0.5 × 0.5 × 2 mm³.

Image analysis
N4-bias field correction of T1-weighted scans was performed to reduce intensity inhomogeneity. 11 White matter lesion segmentation was performed automatically with Bayesian model selection 12 using FLAIR and 3D T1 images jointly, and was manually edited (L.H.) using 3D Slicer. 13 Additionally, manual lesion counting (using JIM, version 6, Xinapse Systems) was performed for infratentorial, juxtacortical, deep white matter and periventricular white matter lesions by raters (K.C., F.B.) blinded to the clinical status. T1-hypointense white matter lesions were filled with the most plausible texture using a multi-modality non-local mean algorithm. 14 Thereafter, brain parenchymal fraction, grey matter fraction and thalamic volume (corrected for the total intracranial volume) were computed using an atlas-based segmentation method. 15 MTR maps were calculated using the following equation: MTR = [(MT_off − MT_on)/MT_off] × 100, where MT_off and MT_on denote the signal acquired without and with the magnetization transfer prepulse, respectively. We report the average cortical MTR derived from the inner and the outer cortical bands that were generated as reported previously. 16 Similarly, the brain white matter was segmented into 12 concentric bands, based on the distance between the ventricular walls and the cortex, and the mean MTR values were calculated in each band. 17 As reported previously, the first and last bands (nearest to the ventricular and cortical surfaces) were excluded from further analysis to control for partial volume effects. 18 Within the normal appearing white matter, a gradient was calculated via the equation (MTR in band 3 − MTR in band 1)/2. 17,18 Within white matter lesions, the average lesional MTR per individual is reported (see Supplementary Fig. 1 for details).
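The MTR and gradient calculations described above can be illustrated with a minimal sketch. It assumes the conventional definition of MTR in percent units, and all names (`mtr_percent`, `band_means`, `nawm_gradient`) as well as the toy voxel values are hypothetical; it is not the pipeline used in this study, and the band labels simply follow the numbering given in the text.

```python
def mtr_percent(mt_off, mt_on):
    """MTR in percent units for one voxel.

    mt_off: signal without the MT prepulse; mt_on: signal with the prepulse.
    """
    if mt_off <= 0:
        return 0.0  # assumption: background or invalid voxels are set to zero
    return 100.0 * (mt_off - mt_on) / mt_off


def band_means(mtr_values, band_labels):
    """Mean MTR per band label (e.g. concentric white matter bands)."""
    sums, counts = {}, {}
    for value, band in zip(mtr_values, band_labels):
        sums[band] = sums.get(band, 0.0) + value
        counts[band] = counts.get(band, 0) + 1
    return {band: sums[band] / counts[band] for band in sums}


def nawm_gradient(means):
    """Gradient as written in the text: (MTR in band 3 - MTR in band 1) / 2."""
    return (means[3] - means[1]) / 2.0


# Toy voxels (hypothetical): (signal without prepulse, signal with prepulse, band)
voxels = [(1000, 640, 1), (1000, 660, 1), (1000, 620, 3), (1000, 600, 3)]
mtr = [mtr_percent(off, on) for off, on, _ in voxels]
means = band_means(mtr, [band for _, _, band in voxels])
gradient = nawm_gradient(means)
print(means, gradient)
```

In this toy example the mean MTR rises from band 1 to band 3, so the gradient is positive; with real data the sign and size depend on how the bands are ordered relative to the ventricles.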
Lesions with clear morphological evidence of cortical involvement 19 were manually counted on PSIR using 3D Slicer software in consensus, blinded to the clinical status (L.H., O.G.). In case of disagreement, a decision was reached by expert opinion (F.B.). White matter lesion counts, marked in consensus (K.K.C. with F.B., D.T.C., or both), were available from our earlier analysis. 3 The cervical spinal cord volume was measured on T1-weighted brain scans with the cord finder tool in JIM (version 6, Xinapse Systems), using an active surface model without straightened cord. 20 Seed points were manually placed in the centre of the cord starting at the C3/4 disk level and continued consecutively rostrally for the next adjacent 40 slices (1 mm slice thickness). The segmentation was reviewed for accuracy and manually edited when necessary (L.H.) and was successful in 57/63 individuals. Cervical spinal cord lesions were not assessed as they are rarely seen on T1-weighted images (such as those used to measure cord atrophy in this study) and the T2-weighted brain images did not include the cervical cord.

Statistical analysis
Group differences in the distribution of clinical outcome variables and MRI biomarkers between the 30-year outcome defined groups were estimated with linear regression models adjusting for age and gender. Lesion counts were not corrected for head size. The beta with 95% confidence interval (CI), its corresponding P-value, the explained variance (R²) and the overall P-value are reported.
Linear regression models were computed to explain clinical outcome measures at 30 years based on MRI metrics at 30 years. To allow a comparison of the effect size of different MRI biomarkers [e.g. (n) lesion counts versus (ml) volume] on a given clinical outcome measure, standard scores were computed as the fractional number of standard deviations, by which each observed value, e.g. a lesion count, is above or below the mean value of the whole group.
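As a minimal sketch of the standard score computation described above, the snippet below converts a lesion count and a volume to z-scores so that both are expressed in units of standard deviations. The function name and the toy values are hypothetical, and the use of the sample standard deviation (n − 1 denominator) is an assumption, as the text does not specify which estimator was used.

```python
from statistics import mean, stdev


def z_scores(values):
    """Express each value as the number of SDs above or below the group mean."""
    m = mean(values)
    sd = stdev(values)  # assumption: sample SD (n - 1 denominator)
    return [(v - m) / sd for v in values]


# Toy data (hypothetical): a count in n and a volume in ml
lesion_counts = [0, 2, 4, 6, 8]
grey_matter_volume_ml = [610.0, 620.0, 630.0, 640.0, 650.0]

z_counts = z_scores(lesion_counts)
z_volumes = z_scores(grey_matter_volume_ml)
print([round(z, 2) for z in z_counts])
print([round(z, 2) for z in z_volumes])
```

Because both toy variables are evenly spaced, they map onto identical z-scores despite their different units, which is what makes regression coefficients on standardized predictors comparable.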
Stepwise selection models (i.e. sequential replacement) were used for parameter selection. This process begins without predictors and iteratively adds the most contributing predictors (based on the Akaike information criterion) until the improvement is no longer statistically significant (comparable to forward selection). However, for each new variable added, variables that no longer provide an improvement in the model fit are also eliminated (comparable to backward selection). Cervical spinal cord volume and thalamic volume were corrected for total intracranial volume.
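The sequential replacement procedure described above can be sketched as follows. This is an illustrative standard-library implementation, not the R code used for the analysis: the stopping rule here is simply that no addition or removal lowers the AIC, covariates such as age and gender are not forced into the model, and all variable names and data are hypothetical.

```python
import math


def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination with pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        x[i] = (m[i][n] - sum(m[i][j] * x[j] for j in range(i + 1, n))) / m[i][i]
    return x


def aic(data, outcome, predictors):
    """AIC (up to an additive constant) of a least squares fit with intercept."""
    y = data[outcome]
    n = len(y)
    cols = [[1.0] * n] + [data[p] for p in predictors]
    xtx = [[sum(u * v for u, v in zip(ci, cj)) for cj in cols] for ci in cols]
    xty = [sum(u * v for u, v in zip(ci, y)) for ci in cols]
    beta = solve(xtx, xty)
    rss = sum((yi - sum(b * c[i] for b, c in zip(beta, cols))) ** 2
              for i, yi in enumerate(y))
    return n * math.log(rss / n) + 2 * len(cols)


def stepwise(data, outcome, candidates):
    """Add the best predictor, then drop any that no longer helps; repeat."""
    selected = []
    best = aic(data, outcome, selected)
    improved = True
    while improved:
        improved = False
        # Forward step: try each unused candidate and keep the best one.
        trials = [(aic(data, outcome, selected + [c]), c)
                  for c in candidates if c not in selected]
        if trials and min(trials)[0] < best:
            best, chosen = min(trials)
            selected.append(chosen)
            improved = True
        # Backward step: remove predictors whose removal lowers the AIC.
        for p in list(selected):
            reduced = [q for q in selected if q != p]
            score = aic(data, outcome, reduced)
            if score < best:
                best, selected, improved = score, reduced, True
    return selected, best


# Toy data (hypothetical): outcome driven by one predictor plus small noise
data = {
    "edss": [1.1, 0.9, 2.1, 1.9, 3.1, 3.9, 5.1, 5.9],
    "cortical_lesions": [0, 0, 1, 1, 2, 3, 4, 5],
    "unrelated_marker": [3, 1, 4, 1, 5, 9, 2, 6],
}
selected, best = stepwise(data, "edss", ["cortical_lesions", "unrelated_marker"])
print(selected, round(best, 2))
```

Each accepted step strictly lowers the AIC, so the loop terminates. In practice the predictors would first be standardized and the adjustment covariates kept in every candidate model.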
MRI and clinical outcome scores are provided as means with standard deviations (SD) or as medians with 25% to 75% range, as appropriate. Statistical estimates are reported with a 95% CI and the corresponding P-value (two-tailed, and exact until P < 0.001) for this estimate.
The statistical analysis was performed with RStudio. 21 P-values < 0.05 (two-tailed) were considered statistically significant.
summarized along with their quantitative group differences, obtained from regression models in Table 1 (see also Chung et al. 3 for more details).
Adjusting for age and sex, the distributions of clinical outcome measures were well separated between RRMS and SPMS cases, whilst they were largely overlapping between CIS and RRMS, with the exception of the 9HPT, which was more abnormal in RRMS than CIS (Fig. 2 and Table 1). SPMS showed worse scores for the EDSS, T25FWT and 9HPT (P < 0.001) than RRMS (Table 1). Differences between RRMS and SPMS were also found for cognitive outcome measures, including the revised BVMTR-z and SDMT-z, which showed a greater abnormality in SPMS than RRMS (Table 1). No differences in the cognitive tests were seen between RRMS and CIS.

MRI characteristics
The distributions of quantitative MRI measures within the 30-year outcome defined groups (CIS, RRMS and SPMS) are visualized in Fig. 3 and summarized with their quantitative group differences, obtained from regression models, in Table 2 (a version based on z-scores is available in Supplementary Table 3).
Adjusting for age and sex, the greatest difference between RRMS and SPMS was present in the number of cortical lesions, followed by grey matter fraction. Cortical lesions were seen in 3 of 27 RRMS patients (mean 0.2, SD 0.7) and in all SPMS patients (n = 15, mean 2.2, SD 1.5). Of note, the three RRMS patients with cortical lesions had the highest EDSS scores among all RRMS [median EDSS 1.5 (25% to 75% range: 1.0-2.0)] (Fig. 3, Table 2 and Supplementary Table 3).
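The sensitivity and specificity quoted in the abstract can be related to these counts with a short worked example. It assumes that SPMS is treated as the positive class and that the presence of any cortical lesion is the test result, which is one plausible reading of how the figures were derived; the function name is hypothetical.

```python
def sensitivity_specificity(tp, fn, tn, fp):
    """Sensitivity = TP / (TP + FN); specificity = TN / (TN + FP)."""
    return tp / (tp + fn), tn / (tn + fp)


# Counts from the text: cortical lesions in 15 of 15 SPMS and 3 of 27 RRMS patients.
sens, spec = sensitivity_specificity(tp=15, fn=0, tn=24, fp=3)
print(f"sensitivity = {sens:.1%}, specificity = {spec:.1%}")
# prints "sensitivity = 100.0%, specificity = 88.9%"
```

Under these assumptions the specificity is 24/27, which is consistent with the 88% reported in the abstract if that figure was truncated rather than rounded.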
Adjusting for age and sex, the greatest difference between CIS and RRMS was the higher white matter lesion count in RRMS, followed by lower lesional MTR in RRMS. The mean white matter lesion count in individuals with CIS was 12 (SD 19.7), compared to 94 (SD 69.1) in RRMS (P < 0.001) (Fig. 3, Table 2 and Supplementary Table 3).
Graphical examples of the typical MRI patterns that emerged are shown in Fig. 4.

Explaining physical and cognitive impairment by MRI
Adjusting for age and sex, linear regression models with stepwise selection were performed on z-scores, to allow the effect sizes of different variables to be compared amongst each other.
When considering the cognitive outcomes, cortical MTR emerged as the strongest significant predictive factor in all models to explain cognitive outcome measures (SDMT, CVLT, BVMTR, PASAT). In particular, cortical MTR (beta: 0.87, 95% CI: 0.257 to 1.482) was the only explanatory variable that remained statistically significant and explained up to 52% (R 2 ) of the variability of SDMT, with estimates of cortical lesions and estimates above the significance level for white matter lesions, brain parenchymal fraction and cervical spinal cord volume above the statistical significance level (

Discussion
In the present cohort, despite 30 years of follow-up, 43% remained phenotypically classified as having RRMS with disability levels not too different from those who remained CIS (Table 1), raising the question of which factors discriminate SPMS from RRMS. In this unique cohort with a long and homogeneous disease duration, we found that cortical lesions were the clearest determinant of a progressive disease course, and that cortical involvement 30 years after symptom onset was the dominant factor explaining both physical and cognitive outcomes.
Cortical lesions were absent in all individuals who remained CIS, were found in three of 27 patients who developed RRMS, but were present in all 15 individuals affected by SPMS. While greater numbers of cortical lesions have been associated with progression in both MRI 22 and histopathological studies, 23 the differences were less distinct compared with the present study. Previous histopathological studies have shown grey matter lesions to be present in acute early 24 and RRMS 25 patients. We hypothesize that this can be explained on the one hand by having studied RRMS groups cross-sectionally with short follow-up and disease duration, thus including individuals who would potentially go on to develop SPMS. 26 On the other hand, subpial demyelination, accounting quantitatively for most of the cortical involvement, as depicted by immunohistochemistry for myelin antigens, 27 still remains undetected by current clinically available in vivo MRI, irrespective of the applied imaging techniques. 28 Leukocortical lesions, therefore, only reveal the tip of the iceberg. Our lack of sensitivity for detection of cortical lesions in cases with lower cortical lesion loads or primarily subpial demyelination may thus also explain the absence of cortical lesions in RRMS in our present study. Finally, primary cortical demyelination is highly specific for multiple sclerosis, 29 whereas MRI does detect white matter lesions that are less specific for multiple sclerosis, including leukoaraiosis and other non-multiple sclerosis lesions. 4,30 This might be especially relevant in a cohort with an older age, such as the present one.
Figure 1 Flow chart. Over the course of 30 years, 29 individuals died (of those, 16 deaths were related to multiple sclerosis), 12 were lost to follow-up (three abroad, four alive and five without further information available) and no MRI could be obtained in 28 participants (of whom nine were classified as CIS, eight as RRMS and 11 as SPMS).
The assessment of cortical lesions is challenging. Conventional MRI sequences are suboptimal, 19 and results therefore differ depending on which sequence is used. 31 This comes with a high inter- and intrarater variability of cortical lesion assessment, 4 and a reliable automatic detection method has not yet been established. We sought to minimize these issues by performing blinded consensus ratings with two experienced readers (L.H. and O.G.), with blinded expert opinion in case of different ratings (F.B.), and by restricting the analysis to obvious lesions as shown in Fig. 4.
In line with previous literature, 32 we observed a dominant effect of grey matter MTR on cognitive outcome measures, whereas grey matter atrophy was more predictive for EDSS (and T25FWT in our study). Overall, disability measures were frequently explained by grey matter related MRI measures, but additional metrics increased model performances. For example, cervical cord volume was also associated with EDSS, 9HPT and BVMTR; white matter lesions with the BVMTR; and white matter lesion MTR with the 9HPT. We observed weak gender effects for the T25FWT (Table 3). This is in line with previous work that has suggested a dominant role for cortical involvement for prediction of current 5 and future disability, 26 but also reflects the way in which multiple sclerosis pathology at any point within a neural network can cause disability. 33 It is noteworthy that despite a highly controlled study environment and well separated cohorts with long and homogeneous disease duration, a considerable amount of variance remains unexplained by current structural imaging. Some of this uncertainty might reflect limitations inherent to clinical measurement [e.g. for EDSS scores, both inter-operator and intra-operator variability contribute significantly to estimated scores 34 ]. However, it also suggests that current structural MRI sequences are relatively insensitive 35 to biologically and clinically relevant aspects of tissue injury, 36 and to the networked nature of the brain. 37,38 For any given neurological or cognitive outcome, some parts of the brain, within a given individual, will play a greater part than others. 39
Thalamic atrophy has been repeatedly suggested as an early correlate of present and subsequent disability in multiple sclerosis. 40 While the group differences for thalamic volume in our study were significant comparing CIS and RRMS (P = 0.001) and CIS and SPMS (P = 0.004), RRMS compared with SPMS was far from the significance level (P = 0.828). To some extent the sample size, as discussed below, might have limited our sensitivity to detect group differences. However, given the highly significant differences compared to CIS, our findings might offer additional insights. First, most studies comparing RRMS and SPMS have not been able to match individuals, i.e. the SPMS groups are often older with longer disease durations, and thus observed differences might be partially a function of the time that separates RRMS and SPMS. Second, thalamic atrophy may reach a floor effect already in RRMS, with further decline in thalamic volume as a function of age (and disease duration) occurring in both SPMS and RRMS.

This study has several other limitations. We did not have a control group and it has been reported that CIS might harbour residual inflammatory damage when compared with healthy controls, 41 but this does not affect our study design. At 30 years, 28 participants (nine CIS, eight RRMS and 11 SPMS) for whom we had clinical data did not have an MRI, thus reducing our sample size and statistical power, which is relatively low when compared to other CIS cohorts. 42-44 A significant caveat is that 16 participants died due to multiple sclerosis during the follow-up, and so even among those with SPMS, imaging findings will be biased towards those with a relatively benign disease evolution. For the 28 subjects with a known 30-year follow-up outcome in whom no MRI could be obtained (Table 1), the median EDSS in CIS [2.0 (25%-75% range: 0-2.0)] and RRMS [2.0 (range: 0.5-2.0)] was comparable to that of subjects who were included in the MRI analysis [CIS: 1.0 (25%-75% range: 0-2.0); RRMS: 1.5 (25%-75% range: 1.0-2.0)]. However, the EDSS scores in SPMS subjects for whom no MRI could be obtained were higher than in those with MRI [8.0 (25%-75% range: 6.5-8.5) compared with 6.0 (6.0-6.5)]. While this may have reduced our statistical power to detect group differences due to exclusion of subjects with more severe disease, it is not likely to introduce a systematic bias towards spurious differences being found. Given the average age of participants in this cohort, age-related changes (white matter lesions in particular) and cortical atrophy cannot be robustly separated from multiple sclerosis pathology. This has the potential to complicate our assessments of clinical associations, as age-related white matter lesions may also affect clinical outcomes 45 or dilute the apparent effect of multiple sclerosis white matter lesions, 33 which is less likely to be the case for grey matter lesions.
Natural ageing is thus likely to be less relevant to associations of cortical pathology with clinical outcomes, as age was homogeneously distributed between the three outcome groups [CIS: 60.5 (SD: 7.1), RRMS: 60.6 (SD: 6.4) and SPMS: 61.9 (SD: 6.7)] and included as a covariate in the statistical models. In the present study we report factors that most robustly distinguish RRMS and SPMS 30 years after CIS. However, due to the study design, we do not know to which extent, or at which time point in the evolution of the disease, such factors become relevant, or if cervical spinal cord volume, 46 or MRI lesion loads, 47 would outperform cortical involvement for prediction of conversion or disability progression rates. While we did analyse a large spectrum of structural MRI and clinical outcome measures, we could not evaluate the presence of cervical spinal cord lesions and susceptibility weighted features. Additionally, it is currently a matter of debate how to adjust cervical spinal cord volumes between individual subjects. In the present study, we corrected the cervical spinal cord volumes for the total intracranial volume, assuming, in line with previous research, 6 a positive association between head size and cord volume. However, there is reason to argue for different adjustment methods, such as patient size and weight, for which we could not control. In the absence of a generally accepted correction method for spinal cord volumes, where due to the small measures even minor adjustments could potentially influence statistical outcomes, our results regarding this metric might be considered preliminary.
[Table legend] Stepwise selection was used to determine linear regression models that explain the different clinical outcomes via MRI biomarkers, including age and gender as covariates. AIC = Akaike information criterion; BPF = brain parenchymal fraction; CVLT = California Verbal Learning Test; GMF = grey matter fraction; PASAT = paced auditory serial addition test. *P < 0.05. **P < 0.001.
In conclusion, in the present cohort we found that cortical involvement was the main MRI feature that distinguished SPMS from RRMS. Cortical lesions, grey matter fraction and cortical MTRs most consistently explained neurological and cognitive impairments, indicating that cortical involvement should be the target of therapeutic interventions.