Learning Patterns of the Ageing Brain in MRI using Deep Convolutional Networks

Both normal ageing and neurodegenerative diseases cause morphological changes to the brain. Age-related brain changes are subtle, nonlinear, and spatially and temporally heterogeneous, both within a subject and across a population. Machine learning models are particularly suited to capturing these patterns and can produce a model that is sensitive to changes of interest, despite the large variety in healthy brain appearance. In this paper, the power of convolutional neural networks (CNNs) and the rich UK Biobank dataset, the largest database currently available, are harnessed to address the problem of predicting brain age. We developed a 3D CNN architecture to predict chronological age, using a training dataset of 12,802 T1-weighted MRI images and a further 6,885 images for testing. The proposed method shows competitive performance on age prediction but, most importantly, the CNN prediction errors ΔBrainAge = AgePredicted − AgeTrue correlated significantly with many clinical measurements from the UK Biobank in the female and male groups. In addition, having used images from only one imaging modality in this experiment, we examined the relationship between ΔBrainAge and the image-derived phenotypes (IDPs) from all other imaging modalities in the UK Biobank, showing correlations consistent with known patterns of ageing. Furthermore, we show that the use of nonlinearly registered images to train CNNs can lead to the network being driven by artefacts of the registration process and missing subtle indicators of ageing, limiting the clinical relevance. Due to the longitudinal aspect of the UK Biobank study, in the future it will be possible to explore whether the ΔBrainAge from models such as this network are predictive of any health outcomes.

Highlights

• Brain age is estimated using a 3D CNN from 12,802 full T1-weighted images.
• Regions used to drive predictions are different for linearly and nonlinearly registered data.
• Linear registrations utilise a greater diversity of biologically meaningful areas.
• Correlations with IDPs and non-imaging variables are consistent with other publications.
• Excluding subjects with various health conditions had minimal impact on main correlations.
cognitive performance, ageing fitness, and mortality [15,16,17] strongly supports the idea

are able to explore the independent associations with the IDPs from the other modalities.

Our goal is not simply to produce the lowest error on the age prediction task but to show independent biological associations with the deltas. This is therefore a contribution of this work beyond the current brain age literature on the UK Biobank. Furthermore, we explore the effect of the registration process on the predictions, and demonstrate that the

The original image dimensions were 182 × 218 × 182 voxels; however, to reduce the memory needed for computation, the x and y dimensions were resized to 128 × 128 voxels, and only 20 slices out of 182 were used, as can be seen in Fig. 2. To include a sufficient amount of anatomical information and resolution whilst reducing redundancy, every fourth slice was used from a region of 80 slices. The region was chosen so as to include as much of the brain as possible while still including some information on the cerebellum, whose structure and function are also impacted by ageing [42].

mean of the three network predictions. A complete schematic of both of these architectures is illustrated in Fig. 3.

This, therefore, helps to make sure that the predicted ΔBrainAge values are being driven by biological differences and not the random network initialisation or stochasticity in training.
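The slice selection and in-plane resizing described above can be sketched as follows. This is an illustration only: the slicing axis (assumed here to be the last, z, axis), the start index of the 80-slice region, and the nearest-neighbour resampling are assumptions, since the text specifies only the input size, the 128 × 128 target, and the use of every fourth slice from a region of 80.

```python
import numpy as np


def subsample_volume(volume, start=50, region=80, step=4, out_xy=(128, 128)):
    """Reduce a 3D volume to a stack of resized slices.

    `start` is a hypothetical offset for the 80-slice region; the text
    does not state where the region begins.
    """
    # Keep every `step`-th slice from [start, start + region): 80 / 4 = 20 slices.
    slices = volume[:, :, start:start + region:step]

    # Nearest-neighbour in-plane resize (assumed; the interpolation
    # method is not stated in the text).
    nx, ny, nz = slices.shape
    xi = np.arange(out_xy[0]) * nx // out_xy[0]
    yi = np.arange(out_xy[1]) * ny // out_xy[1]
    return slices[np.ix_(xi, yi, np.arange(nz))]


# Synthetic stand-in for a registered T1-weighted image.
volume = np.random.default_rng(0).random((182, 218, 182), dtype=np.float32)
reduced = subsample_volume(volume)
print(reduced.shape)
```

With these settings the network input holds 128 × 128 × 20 voxels, roughly 5% of the voxels in the original volume.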

The final ensemble networks were trained on the full dataset with a 90%/10% training/validation split and their performance was evaluated on a hold-out test dataset (subject numbers can be seen in Table 1). For each network and the final ensemble networks, the mean squared error (MSE), mean absolute error (MAE), and standard deviation of ΔBrainAge = AgePredicted − AgeTrue were calculated (the results can be seen in Table 2).
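As a minimal sketch of this evaluation, the snippet below averages the outputs of three networks and computes the MSE, MAE, and standard deviation of ΔBrainAge. The function name, the example ages, and the use of the population standard deviation (`ddof=0`) are assumptions made for illustration; they are not taken from the original implementation.

```python
import numpy as np


def ensemble_metrics(member_preds, true_age):
    """Average per-network predictions and summarise the resulting deltas.

    member_preds: array of shape (n_networks, n_subjects)
    true_age:     array of shape (n_subjects,)
    """
    member_preds = np.asarray(member_preds, dtype=float)
    true_age = np.asarray(true_age, dtype=float)

    # Ensemble prediction: mean of the individual network outputs.
    ensemble_pred = member_preds.mean(axis=0)

    # Delta = predicted age minus true age.
    delta = ensemble_pred - true_age
    return {
        "mse": float(np.mean(delta ** 2)),
        "mae": float(np.mean(np.abs(delta))),
        "delta_std": float(np.std(delta)),  # ddof=0 is an assumption
        "delta": delta,
    }


# Made-up ages for three subjects and three networks.
true_age = [50, 60, 70]
member_preds = [[52, 58, 71], [51, 59, 72], [53, 57, 73]]
metrics = ensemble_metrics(member_preds, true_age)
print(metrics["mse"], metrics["mae"])
```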

Separate networks were trained and tested on the female and male subjects, as well as on the linearly registered and nonlinearly warped image datasets.

The training of the network was implemented using the Keras (v2.

(a) The architecture of the fundamental network used in this work. As illustrated, each convolution is followed by batch normalization and ReLU activation. The number of filters in each of the twelve layers increases as f, f, 2f, 2f, 2f, 2f, 3f, 3f, 3f, 3f, 3f, 3f.
(b) The ensemble network architecture. Each of the networks has the same structure but independent weights, learnt from randomly initialised weights. The aggregation is completed through averaging the outputs of the three individual networks.

for each network in the ensemble are reported in Table 3 and Table 4.

When a high age prediction delta occurred for a subject, we compared their input images to those of a subject of the same age who had a low age prediction delta. In Figure 5, we demonstrate this comparison on Subjects 1 and 2: both subjects were 55 years old, but Subject 1 was predicted to be 72.31 with the linearly registered data and 68.28 with the nonlinearly registered data, while Subject 2 was predicted to be 55.44 with the linearly registered data and 58.43 with the nonlinearly registered data. Figures 5B and D show the same slice, linearly and nonlinearly registered. Figures 5A and C show slices from the linearly registered brain for comparison between the two subjects.

To examine which brain features are associated with subjects that have a higher predicted brain age, the nonlinearly registered images of all subjects predicted to be 75-80 or 45-50 by the CNN in each sex group were averaged to form two mean composite images (Figure 6B). This process was repeated for the linearly registered images in Figure 6A. The largest difference was seen around the ventricles for both types of registration.

Figure 4: Density plot of the predicted ages vs the true subject age. (a) Predictions resulting from linearly registered images for the female subjects; (b) predictions resulting from linearly registered images for the male subjects; (c) predictions resulting from nonlinearly registered images for the female subjects; (d) predictions resulting from nonlinearly registered images for the male subjects.
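The composite-image comparison of the 75-80 and 45-50 predicted-age groups (Figure 6) can be sketched as below. The function name, the half-open treatment of the age bands, and the toy data are assumptions for illustration; the text does not state whether the band limits are inclusive.

```python
import numpy as np


def composite_difference(images, predicted_age, old=(75, 80), young=(45, 50)):
    """Average the images in two predicted-age bands and return their difference.

    images:        array of shape (n_subjects, x, y, z)
    predicted_age: array of shape (n_subjects,)
    """
    images = np.asarray(images, dtype=float)
    predicted_age = np.asarray(predicted_age, dtype=float)

    # Half-open bands [low, high) are an assumption.
    old_mask = (predicted_age >= old[0]) & (predicted_age < old[1])
    young_mask = (predicted_age >= young[0]) & (predicted_age < young[1])
    if not old_mask.any() or not young_mask.any():
        raise ValueError("each age band needs at least one subject")

    old_mean = images[old_mask].mean(axis=0)
    young_mean = images[young_mask].mean(axis=0)
    return old_mean, young_mean, old_mean - young_mean


# Toy data: four constant-valued 2 x 2 x 2 "images".
images = np.stack([np.full((2, 2, 2), v) for v in (1.0, 3.0, 5.0, 9.0)])
predicted_age = [46.0, 48.0, 76.0, 79.0]
old_mean, young_mean, difference = composite_difference(images, predicted_age)
print(difference[0, 0, 0])
```

Voxels with a large absolute difference mark the regions that differ most between the two groups; in the text these lie around the ventricles.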

To confirm the utility of the network we also trained the network only on healthy subjects by two definitions: the first, the removal of conditions known to impact on the brain, and the second, the removal of these subjects plus those with conditions we have found to correlate with ΔBrainAge (diabetes and hypertension) that had not already been removed. These

In this work, a CNN was used to predict brain age using 14,226 subjects from the UK Biobank as training and a further 6,885 subjects for testing. The network performed well on both sex groups, using either the linearly registered or nonlinearly warped images as input.

Using the linearly registered image dataset as input, the 3D ensemble network achieved a

the UK Biobank study, in the future we will have the ability to explore whether the deltas from models such as this are predictive of any health outcomes.

The data are available from the UK Biobank by application. The code and weights are available on request from the corresponding author.

Removal of Brain Age Biases
Most brain age literature shows an underestimation of brain age for old subjects and an overestimation of brain age in young subjects, resulting from several factors. This results in an age dependence in the estimated deltas, which will be problematic when computing associations with nonimaging variables, as they will be driven by the true age rather than the age delta. We therefore follow the process described by Smith et al. [39] to correct for both linear and quadratic associations with age.

We can describe our results in the form:

where Y are the true age values, Y_B the estimated 'biased' age values, X the array of brain imaging measurements, which in our case take the form of the T1-weighted MRI images, and F(.) the model mapping from X to Y_B, which in our case takes the form of the CNN discussed in section 2.2.

We first consider only linear dependence with age, which is simply removed with a second step. If we consider the scatter plots such as those shown in Fig. 4 before the removal of the age dependence, then we can see that ideally we would want the deltas to be distributed around 0 with no overall slope and so no overall dependence on true age Y. If there is an overall slope then we can simply fit a straight line to the full dataset and subtract this from the deltas. Therefore:

where β_2 is the regressor to model the linear dependence and δ_2 are the residuals from this fitting, which are orthogonal to age with the biases removed. Therefore, the predicted brain age for this step is now:

This therefore provides us with an estimate for the deltas with linear relationships to age removed.
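For reference, the linear correction step described above can be written out as follows. This is a sketch in the notation of Smith et al. [39], not a transcription of the original displayed equations: the symbol δ_1 for the uncorrected delta, and the form of the corrected brain age in the last line, are assumptions chosen to be consistent with the definitions of Y, Y_B, F(.), β_2 and δ_2 given in the text.

```latex
\begin{align}
  Y_B      &= F(X)                     && \text{biased brain age predicted by the CNN} \\
  \delta_1 &= Y_B - Y                  && \text{uncorrected delta (symbol assumed)} \\
  \delta_1 &= Y\beta_2 + \delta_2      && \text{straight-line fit of the deltas against true age} \\
  \delta_2 &= \delta_1 - Y\beta_2      && \text{residual delta, orthogonal to age} \\
  Y + \delta_2 &                       && \text{corrected brain age (assumed form)}
\end{align}
```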

This now needs to be extended to include nonlinear relationships with age, particularly because the acceleration of the effects of ageing in older age seems likely, especially with disease. We therefore simply adapt the model to include an additive nonlinear term in Y.

We simply add a quadratic term as the most natural extension and so:

and

where β_2q has two regressors covering the linear and quadratic terms. Therefore, the linear and quadratic relationships with age can be removed from the deltas through multiple regression.
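A minimal numerical sketch of this correction is given below: the uncorrected deltas are regressed on true age and, optionally, its square, and the residuals are kept as the corrected deltas. The function name, the explicit intercept column, the mean-centring of age, and the synthetic data are assumptions made for illustration rather than details of the original implementation.

```python
import numpy as np


def remove_age_bias(true_age, predicted_age, quadratic=True):
    """Return deltas with linear (and optionally quadratic) age dependence removed."""
    y = np.asarray(true_age, dtype=float)
    delta1 = np.asarray(predicted_age, dtype=float) - y  # uncorrected delta

    # Design matrix: intercept, centred age, and optionally centred age squared.
    # The intercept and centring are assumptions, not stated in the text.
    yc = y - y.mean()
    columns = [np.ones_like(y), yc]
    if quadratic:
        columns.append(yc ** 2)
    design = np.column_stack(columns)

    # Multiple regression of the deltas on the age terms.
    beta, *_ = np.linalg.lstsq(design, delta1, rcond=None)

    # The residuals are orthogonal to every column of the design matrix.
    return delta1 - design @ beta


# Synthetic example showing the typical bias: old subjects are
# underestimated and young subjects are overestimated.
rng = np.random.default_rng(0)
true_age = np.linspace(45.0, 80.0, 500)
predicted_age = 0.7 * true_age + 0.3 * 62.5 + rng.normal(0.0, 3.0, true_age.size)
raw_delta = predicted_age - true_age
corrected = remove_age_bias(true_age, predicted_age)

print(np.corrcoef(true_age, raw_delta)[0, 1])
print(np.corrcoef(true_age, corrected)[0, 1])
```

Because the residuals of a least-squares fit are orthogonal to the regressors, the corrected deltas have essentially zero correlation with both age and age squared, so later associations with nonimaging variables are not driven by true age.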

Training on Healthy Subjects
As not all of the subjects in the UK Biobank are healthy, we also trained on only the healthy subjects to see what effect this had on the predictions. We did not remove all patients with reported ICD9/10 codes or self-reported conditions as this left very few subjects. Rather, following [55], we removed subjects whose ICD9/10 codes or self-reported conditions were likely to impact on the brain. Subjects were removed who reported: brain tumours, benign neoplasms of the brain, hydrocephalus/congenital malformations of the nervous system, stroke/cerebrovascular disease, all neurological diagnoses and all mental and behavioural disorders, leaving around 14k subjects in total. The same approach was then taken as explored in the methods but only for the nonlinearly registered images, as this was sufficient for comparison.
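The exclusion step above can be sketched as a simple set-membership filter. The record layout, field names, and subject IDs are hypothetical. The category labels paraphrase the list in the text and stand in for the actual ICD9/10 and self-report codes, which are not given here.

```python
# Category labels paraphrased from the text; the mapping from real
# ICD9/10 or self-report codes to these labels is assumed to exist upstream.
EXCLUDED_CATEGORIES = frozenset({
    "brain tumour",
    "benign brain neoplasm",
    "hydrocephalus",
    "congenital nervous system malformation",
    "stroke",
    "cerebrovascular disease",
    "neurological diagnosis",
    "mental or behavioural disorder",
})


def is_healthy(subject, excluded=EXCLUDED_CATEGORIES):
    """True if neither the ICD-derived nor the self-reported categories are excluded."""
    reported = set(subject.get("icd_categories", ())) | set(subject.get("self_reported", ()))
    return reported.isdisjoint(excluded)


# Hypothetical subject records.
subjects = [
    {"id": "A", "icd_categories": [], "self_reported": []},
    {"id": "B", "icd_categories": ["stroke"], "self_reported": []},
    {"id": "C", "icd_categories": [], "self_reported": ["hypertension"]},
]

healthy = [s["id"] for s in subjects if is_healthy(s)]

# Stricter definition: also exclude diabetes and hypertension.
stricter = EXCLUDED_CATEGORIES | {"diabetes", "hypertension"}
healthy_strict = [s["id"] for s in subjects if is_healthy(s, stricter)]
print(healthy, healthy_strict)
```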

The MAE results and correlations can be seen below.

We also explored the effect of also removing conditions that have been identified by

Finally, we tested on the whole test dataset, healthy and unhealthy, using the model trained on the healthy subjects, so that subjects with correlating conditions were also re-

Significant IDPs and Variables
For each experiment we present the significant correlations, those which passed Bonferroni correction. As too many IDPs passed the threshold to reasonably report them all, we