Preoperative MRI brain phenotypes are related to postoperative delirium in older individuals

The underlying structural correlates of predisposition to postoperative delirium remain largely unknown. A combined analysis of preoperative brain magnetic resonance imaging (MRI) markers could improve our understanding of the pathophysiology of delirium. Therefore, we aimed to identify different MRI brain phenotypes in older patients scheduled for major elective surgery, and to assess the relation between these phenotypes and postoperative delirium. Markers of neurodegenerative and neurovascular brain changes were determined from MRI brain scans in older patients (n = 161, mean age 71, standard deviation 5 years), of whom 24 (15%) developed delirium. A hierarchical cluster analysis was performed. We found six distinct groups of patients with different MRI brain phenotypes. Logistic regression analysis showed a higher odds of developing postoperative delirium in individuals with multi-burden pathology (n = 15 (9%), odds ratio (95% confidence interval): 3.8 (1.1-13.0)). In conclusion, these results indicate that different MRI brain phenotypes are related to a different risk of developing delirium after major elective surgery. MRI brain phenotypes could assist in an improved understanding of the structural correlates of predisposition to postoperative delirium.


Introduction
Postoperative delirium is a common complication of major surgery, characterized by an acute change in attention and awareness with an additional disturbance in cognition (American Psychiatric Association, 2013). Postoperative delirium has an incidence of 15%-51% during hospital admission of older patients undergoing major elective surgery, and is associated with an increased risk of adverse outcomes such as prolonged hospital stay and dementia, thereby also increasing healthcare costs (Inouye et al., 2014; Marcantonio, 2017; Saczynski et al., 2012). Known risk factors for postoperative delirium include advanced age, major surgery (e.g., cardiothoracic or orthopedic), comorbidity, and preoperative cognitive dysfunction (Inouye et al., 2014). However, the exact structural brain correlates related to predisposition to postoperative delirium are less clear, mainly due to the heterogeneous brain changes that are common in older patients.
Previous studies on brain magnetic resonance imaging (MRI) markers that may reflect this neural substrate have all focused on the association between a single preoperative brain MRI marker and the occurrence of postoperative delirium (Cavallari et al., 2015, 2016; Hatano et al., 2013; Hshieh et al., 2017; Kant et al., 2017; Maekawa et al., 2014; Otomo et al., 2013; Shioiri et al., 2015). These markers include preoperative brain volumes as markers for neurodegenerative diseases (Cavallari et al., 2015; Maekawa et al., 2014; Shioiri et al., 2015), white matter hyperintensities (WMH) as a marker for small vessel disease (Cavallari et al., 2015; Hatano et al., 2013) and brain infarcts as a marker for small and large vessel diseases (Otomo et al., 2013). However, older patients most often have heterogeneous brain changes due to aging, reflecting disease processes related to both neurodegenerative and neurovascular diseases (Vinke et al., 2018). Therefore, a combined analysis of these brain MRI markers could be a better representation of the substrate that predisposes to delirium. This could result in an improved understanding of the development of delirium.
We have previously developed a hierarchical clustering approach to analyze brain MRI markers in a combined way, leading to the identification of MRI brain phenotypes that were associated with a different risk of future stroke and mortality in patients with manifest arterial disease (Jaarsma-Coes et al., 2020). To the best of our knowledge, no previous studies have focused on the association between distinct MRI brain phenotypes and postoperative delirium.
In the present study, we aimed to (1) identify different MRI brain phenotypes in older patients scheduled for major elective surgery, and (2) assess the relation between these MRI brain phenotypes and the occurrence of postoperative delirium.

Study sample
The present investigation is part of the BioCog Study, which is a prospective, observational study that aims to identify biomarkers for postoperative cognitive disorders. Participants for the study (1) were ≥65 years of age, (2) did not have severe cognitive impairment (Mini-Mental State Examination score of ≥24), (3) were scheduled for major elective surgery of ≥60 minutes, and (4) were able to undergo MRI scanning. The present study included participants from 1 study center (University Medical Center Utrecht). The medical ethical committee reviewed and approved this study under protocol number 14-469. All participants signed informed consent according to the Declaration of Helsinki.

Procedures
Participants who were scheduled for major elective surgery were invited for a hospital visit prior to surgery. The visit included questionnaires administered by a trained researcher (i.e., demographics, Mini-Mental State Examination, functional abilities, medical history, and cardiovascular risk factors) and an MRI scan. The preoperative American Society of Anesthesiologists (ASA) score was determined by anesthesiologists (in training). After surgery, patients were screened for postoperative delirium as outlined below.

Delirium assessment
Delirium was defined according to the fifth edition of the Diagnostic and Statistical Manual of Mental Disorders (DSM-5) criteria (American Psychiatric Association, 2013). To evaluate these criteria, patients were assessed postoperatively by trained researchers using a validated daily chart review (Inouye et al., 2005), as well as the Confusion Assessment Method (CAM) or the Confusion Assessment Method for the Intensive Care Unit (CAM-ICU) (Ely et al., 2001) and the Nu-DESC (Gaudreau et al., 2005) twice daily until day 7 or until discharge, whichever occurred first.
Patients were considered delirious in case of ≥2 cumulative points on the Nu-DESC and/or a positive CAM-ICU score and/or patient chart review that showed descriptions of delirium (e.g., confused, agitated, drowsy, disorientated, delirious, receiving antipsychotic therapy).
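The classification rule above can be written as a small predicate. This is a minimal sketch for illustration only: the record layout and the names `ScreeningResult` and `is_delirious` are hypothetical and do not come from the study's software.

```python
from dataclasses import dataclass


@dataclass
class ScreeningResult:
    """Postoperative screening summary for one patient (hypothetical layout)."""
    nu_desc_points: int            # cumulative Nu-DESC points
    cam_icu_positive: bool         # positive CAM-ICU score
    chart_shows_delirium: bool     # chart review contains descriptions of delirium


def is_delirious(result: ScreeningResult) -> bool:
    """Apply the and/or rule: any single criterion is sufficient."""
    return (
        result.nu_desc_points >= 2
        or result.cam_icu_positive
        or result.chart_shows_delirium
    )
```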

MRI image processing
MRI image processing steps have been described previously (Kant et al., 2019). In short, 3D FLAIR images were registered to the 3D T1-weighted images using statistical parametric mapping version 12 (SPM12; Wellcome Institute of Neurology, University College London, UK, http://www.fil.ion.ucl.ac.uk/spm/doc/) for Matlab (The MathWorks, Inc., Natick, MA). Thereafter, WMH were automatically quantified using the lesion segmentation toolbox version 2.0.15 (Schmidt, 2017, Chapter 6.1; www.statistical-modeling.de/lst.html) for SPM12. A lesion filling method on the T1-weighted images was performed using the lesion segmentation toolbox. The filled T1-weighted images were used for brain tissue segmentation, and cortical surfaces were estimated using the computational anatomy toolbox for SPM12 (CAT12, Gaser and Dahnke, Jena University Hospital, Departments of Psychiatry and Neurology, http://www.neuro.uni-jena.de/cat/index). All segmentations of total gray matter volume, white matter volume, cerebrospinal fluid (CSF) and WMH were visually checked by trained researchers and, in case of doubt, by a neuroradiologist (JB). Mean cortical thickness was estimated per region of the DK-40 atlas (Desikan et al., 2006). WMH volumes were thresholded, and distinguished per brain lobe as deep, periventricular or confluent. WMH shape markers (solidity, convexity, concavity index, fractal dimensions, eccentricity) were calculated for deep and periventricular or confluent lesions according to an in-house developed method (Ghaznawi et al., 2019; Kant et al., 2019). Perfusion images were analyzed using the ExploreASL toolbox (Mutsaerts et al., 2014), resulting in gray matter perfusion, white matter perfusion and the spatial coefficient of variation (CoV).

Distinguishing MRI brain phenotypes by hierarchical cluster analysis
The brain MRI markers that were included in the cluster analysis were brain volumes (total brain volume fraction, gray matter volume fraction, white matter volume fraction, peripheral CSF fraction, ventricular CSF fraction, mean cortical thickness per region of the DK-40 atlas), WMH (deep WMH volume per lobe, confluent and periventricular (CP) WMH volume per lobe, convexity, solidity, concavity index, fractal dimension of confluent and periventricular lesions, fractal dimension and eccentricity of deep lesions), brain infarcts (number of cortical infarcts, cortical infarct volume, number of lacunar infarcts) and perfusion (gray matter perfusion, white matter perfusion, spatial CoV). Normally distributed variables were expressed using a Z-score. Non-normally distributed variables were scaled to a range between -2 and 2 by normalizing each value (x) between the new minimum (a) and the new maximum (b):
$$x' = a + \frac{(x - \min(x))\,(b - a)}{\max(x) - \min(x)},$$
where a is equal to -2 and b is equal to 2.
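The two scaling steps can be sketched as follows. This is a minimal illustration that assumes the standard min-max rescaling formula and a Z-score based on the sample standard deviation; the function names `z_score` and `rescale` are hypothetical and not taken from the study's code.

```python
from statistics import mean, stdev


def z_score(values):
    """Standardize a normally distributed marker (sample SD is an assumption)."""
    m, s = mean(values), stdev(values)
    return [(v - m) / s for v in values]


def rescale(values, a=-2.0, b=2.0):
    """Min-max rescale a non-normally distributed marker to the range [a, b]."""
    lo, hi = min(values), max(values)
    if hi == lo:
        raise ValueError("a constant variable cannot be rescaled")
    return [a + (v - lo) * (b - a) / (hi - lo) for v in values]
```

With these definitions, the smallest observed value of a marker maps to -2 and the largest to 2, so that skewed markers end up on a scale comparable to that of the Z-scored markers.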
Hierarchical clustering was performed using Ward's method in R version 3.5.1 (R Core Team, 2018) with the packages NbClust (Charrad et al., 2014), factoextra (Kassambara, 2017), cluster (Maechler et al., 2019), and dendextend (Galili, 2015). Hierarchical clustering is a method to distinguish groups (clusters) based on the distances between patients across a set of variables. These clusters are organized as a tree that starts with every patient as a separate cluster, and then repeatedly merges the 2 closest clusters, updating the distance matrix. Therefore, all clusters are a union of 2 subclusters, leading to a hierarchical organization. This is repeated until one group (the total group of patients) remains. This approach can be visualized as a dendrogram (see left y-axis of Fig. 2 for an example). To determine the number of groups that is used for further analysis, the dendrogram needs to be cut at a certain level.
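The agglomerative procedure described above can be sketched in a few lines. The analysis itself was run in R; the Python code below is only an illustrative, unoptimized sketch of Ward's criterion, in which the pair of clusters whose merge gives the smallest increase in within-cluster variance is joined at each step. Stopping at k clusters is used here as a stand-in for cutting the dendrogram at k groups, and the function name `ward_clusters` is hypothetical.

```python
from itertools import combinations


def ward_clusters(points, k):
    """Agglomerative clustering with Ward's criterion, stopped at k clusters.

    points: list of equal-length numeric tuples (one per patient).
    Returns one integer cluster label per point.
    """
    clusters = [[i] for i in range(len(points))]   # every patient starts alone
    centroids = [list(p) for p in points]

    def merge_cost(i, j):
        # Increase in within-cluster sum of squares if clusters i and j merge.
        ni, nj = len(clusters[i]), len(clusters[j])
        d2 = sum((a - b) ** 2 for a, b in zip(centroids[i], centroids[j]))
        return ni * nj / (ni + nj) * d2

    while len(clusters) > k:
        # Find the cheapest pair to merge (i < j is guaranteed by combinations).
        i, j = min(combinations(range(len(clusters)), 2),
                   key=lambda pair: merge_cost(*pair))
        ni, nj = len(clusters[i]), len(clusters[j])
        centroids[i] = [(ni * a + nj * b) / (ni + nj)
                        for a, b in zip(centroids[i], centroids[j])]
        clusters[i] += clusters[j]
        del clusters[j], centroids[j]

    labels = [0] * len(points)
    for label, members in enumerate(clusters):
        for idx in members:
            labels[idx] = label
    return labels
```

For example, `ward_clusters([(0, 0), (0, 1), (10, 10), (10, 11), (20, 0)], k=3)` places the first two points in one group, the next two in a second group, and the last point in a group of its own.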
In an optimally clustered sample, the clustered data have a high within-cluster cohesion and a high separation between different clusters. This can be assessed using the Dunn index (the ratio of the smallest distance between observations in different clusters to the largest within-cluster distance), which needs to be maximized. It can also be assessed visually using the heatmap that plots all variables per group of patients. In the current analysis, both methods were used to estimate the optimal number of groups.
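As an illustration, the Dunn index in its standard form (smallest distance between observations in different clusters, divided by the largest distance between observations in the same cluster) can be computed as sketched below. Euclidean distance and the function name `dunn_index` are assumptions made for this example.

```python
from itertools import combinations
from math import dist, inf


def dunn_index(points, labels):
    """Smallest between-cluster distance divided by largest within-cluster distance."""
    min_between, max_within = inf, 0.0
    for (p, lp), (q, lq) in combinations(zip(points, labels), 2):
        d = dist(p, q)
        if lp == lq:
            max_within = max(max_within, d)
        else:
            min_between = min(min_between, d)
    if max_within == 0.0 or min_between == inf:
        raise ValueError("need at least two clusters and one cluster with distinct points")
    return min_between / max_within
```

A higher value indicates compact, well-separated clusters, so candidate numbers of groups can be compared by computing the index for each cut of the tree.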

Statistical analysis
Between-group differences in demographics were assessed using a χ² test for categorical variables, and a 1-way ANOVA for continuous variables. Between-group differences in the brain MRI markers were assessed by 1-way ANOVA. These analyses were adjusted for multiple comparisons by a false discovery rate correction. Logistic regression analysis was performed to assess the relation between the groups with different MRI brain phenotypes and postoperative delirium. All groups were entered into the same model and compared to the reference group by a single unadjusted logistic regression analysis with postoperative delirium as the dependent variable. A p value of < 0.05 was considered statistically significant.
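Two of these steps lend themselves to a short sketch. For an unadjusted logistic regression with group membership as the only (categorical) predictor, the odds ratio of each group versus the reference equals the odds ratio from the corresponding 2 × 2 table, with a Wald confidence interval on the log scale. The text does not state which false discovery rate procedure was used, so the Benjamini-Hochberg procedure is assumed here. The function names are hypothetical, and any counts passed to them in examples are illustrative rather than study data.

```python
from math import exp, log, sqrt


def odds_ratio_wald(cases_group, noncases_group, cases_ref, noncases_ref, z=1.96):
    """Odds ratio and Wald 95% CI for one group versus the reference group.

    All four cell counts must be non-zero.
    """
    odds_ratio = (cases_group * noncases_ref) / (noncases_group * cases_ref)
    se = sqrt(1 / cases_group + 1 / noncases_group
              + 1 / cases_ref + 1 / noncases_ref)
    lower = exp(log(odds_ratio) - z * se)
    upper = exp(log(odds_ratio) + z * se)
    return odds_ratio, lower, upper


def benjamini_hochberg(p_values):
    """Benjamini-Hochberg adjusted p values, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down, enforcing monotonicity.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted
```

With small groups, the 1/count terms in the standard error become large, which is why confidence intervals around odds ratios are wide when a group contains only a few delirium cases.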

Data availability
The datasets generated and analyzed during the current study are not publicly available as this is a substudy of a still ongoing consortium study, but may be available from the corresponding author on reasonable request.

Study sample
A total of 161 participants (mean age 71, standard deviation (SD) 5 years) were included for the hierarchical cluster analysis with 95 distinct brain MRI markers of neurovascular and neurodegenerative diseases. See Table 1 for an overview of the demographics of the total group and Fig. 1 for a flowchart of the inclusion of participants.

MRI brain phenotypes
The hierarchical cluster algorithm resulted in the dendrogram and heatmap shown in Fig. 2. Based on both the Dunn index (Supplementary Fig. 1) and the heatmap, the optimal cut-off was determined at 6 different groups with distinct MRI brain phenotypes. These groups consisted of 34 (group 1; limited burden, 21%), 39 (group 2; limited burden, 24%), 30 (group 3; limited burden, 19%), 34 (group 4; mainly atrophy, 21%), 9 (group 5; mainly atrophy and small vessel disease (SVD), 6%) and 15 (group 6; multiburden, 9%) patients. Table 2 shows an overview of the main between-group differences in brain MRI markers (for a full list of brain MRI markers that were used in the model, see Supplementary Table 1). Each group had a distinct pattern of brain MRI markers that have driven the distinction made by the hierarchical clustering algorithm, representing different combinations of neurodegenerative and neurovascular brain changes. The mean age of patients in each group ranged from 68.9 ± 3.2 (mean ± SD) to 75.4 ± 6.4 years.
Fig. 2. Heatmap of the hierarchical clustering algorithm. Every row of this figure represents one participant. Every column represents one brain MRI feature. Blue represents a low value, white represents a value around zero and red represents a high value. The left side of the image shows the hierarchical clustering tree (dendrogram) with the separate groups, respectively in red (atrophy and SVD, n = 9), yellow (multiburden, n = 15), blue (mainly atrophy, n = 34), purple (limited burden, n = 34), green (limited burden, n = 30), and dark red (limited burden, n = 39). For example, the blue values in the red group on top represent a relatively low cortical thickness (more atrophy). Another example can be seen in the yellow group, as the right part of the heatmap shows a relatively high WMH burden and a relatively high concavity index (CI) in this group.
The "limited burden" groups showed the least brain MRI changes related to neurovascular and neurodegenerative diseases. There were only small differences in brain changes between the "limited burden" groups; the most pronounced differences were the slightly larger amount of gray matter (GM) atrophy and the slightly higher number of cortical brain infarcts in groups 2 and 3 compared to group 1 (group 1; GM volume (% of intracranial volume (ICV), mean ± SD): 41.1 ± 1.4, number of cortical brain infarcts (mean ± SD): 0.1 ± 0.2, group 2; GM volume (% ICV, mean ± SD): 39.8 ± 1.6, number of cortical brain infarcts (mean ± SD): 0.4 ± 0.7, group 3; GM volume (% ICV, mean ± SD): 39.2 ± 1.2, number of cortical brain infarcts (mean ± SD): 0.4 ± 1.0). The "mainly atrophy" group had an increased overall disease burden of mostly neurodegenerative origin (group 4; GM volume (% ICV, mean ± SD): 38.1 ± 1.8). The "mainly atrophy and SVD" group showed a high SVD and global atrophy burden (group 5; WMH volume (mean ± SD): 26.6 ± 17.1 ml, GM volume (% ICV, mean ± SD): 35.5 ± 1.9), and the "multi-burden" group showed an overall high disease burden with mostly MRI markers of neurovascular diseases, and the highest number of brain infarcts in comparison to other groups (group 6; WMH volume (mean ± SD): 23.4 ± 17.8 ml, number of cortical brain infarcts (mean ± SD): 1.7 ± 3.4, GM volume (% ICV, mean ± SD): 37.8 ± 2.0). The groups differed significantly on almost all brain MRI markers that were used in the hierarchical clustering algorithm (Table 2 and Supplementary Table 1). Table 3 shows the patient demographics of the different groups.
The groups differed significantly in age, preoperative ASA scores (with the highest percentage of ASA 3 scores in the multiburden group; group 6: n = 9 (60%)), hypertension (with the highest percentage of patients with hypertension in the mainly atrophy and SVD group; group 5: n = 8 (89%)), and previous stroke or transient ischemic attack (TIA) (with the highest percentage of patients with a previous event in the multiburden group; group 6: TIA: n = 3 (20%), stroke: n = 5 (36%)).

Association with postoperative delirium
A total of 24 patients developed postoperative delirium (15%). The percentage of patients with postoperative delirium differed per group from 3% to 36% (Table 3). Due to the small number of delirium cases in the reference groups (i.e., the groups with the least amount of brain abnormalities), the groups with limited disease burden (groups 1, 2, and 3) were combined into a single reference group for this analysis only. Logistic regression analysis showed a higher odds of developing postoperative delirium in the "multi-burden" group (OR (95% CI): 3.8 (1.1-13.0)). No association with postoperative delirium was found in the "mainly atrophy and SVD" group (OR (95% CI): 1.0 (0.1-8.5)) or the "mainly atrophy" group (OR (95% CI): 1.3 (0.4-3.8)) (Fig. 3).

Discussion
We showed that distinct MRI brain phenotypes can be identified in older patients who are scheduled for major elective surgery. Furthermore, we found a higher odds of developing postoperative delirium in patients with multiburden brain pathology.
Recent developments in machine learning techniques have enabled analysis of patterns using novel clustering methods. Identification of different MRI brain phenotypes can lead to novel insights into the neural correlates of predisposition to delirium. Our results revealed six distinct subgroups of patients with different distributions of brain MRI markers of neurodegenerative and neurovascular diseases. We have shown that a multiburden MRI brain phenotype (e.g., small vessel disease (SVD), large vessel disease (LVD) and atrophy) may predispose to developing postoperative delirium. The difference between this multiburden MRI brain phenotype and other phenotypes in the presence and volume of cortical infarcts is striking (Supplementary Table 2), and could have partly driven the association between the "multi-burden" MRI brain phenotype and delirium. However, as shown in a recent study within the same dataset (Kant et al., 2021), the presence of cortical infarcts alone does not show a strong association with postoperative delirium. Interestingly, it therefore seems that multiple forms of brain pathology are required to increase the risk of postoperative delirium. Our study also provides further evidence for the hypothesis that surgery and anesthesia act as a stress test, which increases the risk of developing delirium in vulnerable patients (i.e., patients with brain MRI changes related to neurodegenerative or neurovascular diseases).
Future steps that need to be taken to improve our understanding of the relation between preoperative MRI brain phenotypes and postoperative delirium include identification of the (combination of) brain MRI markers that are driving the increased risk of delirium. Our study can act as a guide for brain MRI marker selection in future machine learning studies on increased delirium risk, starting with the MRI markers that differed most between groups. Future studies on delirium are also encouraged to confirm our findings by validating machine learning methods in other cohorts of surgical patients. To increase comparability between research cohorts, image acquisition and processing methods should be standardized and fully automated by implementation of a standard image processing pipeline, which is tested for accuracy and robustness. After these steps have been undertaken, identification of MRI brain phenotypes might also be used as a personalized risk assessment tool for adverse postoperative outcomes in patients who have already had a brain MRI.
To the best of our knowledge, our study is the first to assess preoperative MRI brain phenotypes in relation to postoperative delirium. Strengths of our study include the use of multiple brain MRI markers in one framework. Furthermore, we mostly included markers that can be (semi-)automatically detected on brain MRI scans, using state-of-the-art quantification techniques based on publicly available software (e.g., CAT12). This increases the possibility of future standardization and implementation in other studies. Our method used an automated, unsupervised approach to identify groups, possibly leading to new combinations of brain MRI markers and novel insights that might not have emerged with a conventional approach. As this is an explorative study, no power analysis was conducted. Limitations of our study include that our method has some settings that may seem arbitrary or subjective, such as the number of groups or the brain MRI markers that were used. However, we increased objectivity by using the heatmap and the Dunn index for the choice of the number of groups, and by choosing validated brain MRI markers that can almost all be automatically quantified. Furthermore, we aimed to describe our choices in a transparent way, enabling reproducibility. Another limitation may be the limited number of patients with postoperative delirium. Even though we performed an extensive delirium screening protocol, incomplete detection cannot be ruled out. The limited number of patients with postoperative delirium could also be the result of improved postoperative care such as activation and mobilization (Reuben et al., 2000). Due to the limited number of delirium cases in the group with the least amount of disease burden, we had to combine reference groups. This problem would also have occurred if we had taken the 2 groups with the least amount of disease burden.
Therefore, we followed another approach by combining the three reference groups that had the least amount of brain abnormalities. Replication and external validation of the results of our study in a larger sample are therefore encouraged. Another limitation is that a relatively low number of patients agreed to participate in the current study (Fig. 1). Possible explanations for this are the extensive study protocol including multiple hospital visits, and the population that consisted of older, often frail patients. Furthermore, we have included patients with diverse types of surgery, which in itself may have influenced the risk of postoperative delirium per group. Due to the limited number of patients per group, we could not adjust for type of surgery in our analysis.
Fig. 3. The association between MRI brain phenotypes and postoperative delirium. Odds ratios are shown with a 95% confidence interval.
In conclusion, we have shown that different MRI brain phenotypes can be identified in older patients who are scheduled for major elective surgery. Our results may indicate that different MRI brain phenotypes are related to a different risk of developing postoperative delirium. MRI brain phenotypes could assist in an improved understanding of the structural correlates that predispose individuals to postoperative delirium.

Disclosure statement
The authors have no actual or potential conflicts of interest.

Supplementary materials
Supplementary material associated with this article can be found, in the online version, at doi:10.1016/j.neurobiolaging.2021.01.033.