Relationships between motor and cognitive functions and subsequent post-stroke mood disorders revealed by machine learning analysis

Mood disorders (e.g. depression, apathy, and anxiety) are often observed in stroke patients, exhibiting a negative impact on functional recovery associated with various physical disorders and cognitive dysfunction. Consequently, post-stroke symptoms are complex and difficult to understand. In this study, we aimed to clarify the cross-sectional relationship between mood disorders and motor/cognitive functions in stroke patients. An artificial neural network architecture was devised to predict three types of mood disorders from 36 evaluation indices obtained from functional, physical, and cognitive tests on 274 patients. The relationship between mood disorders and motor/cognitive functions was comprehensively analysed by performing input dimensionality reduction for the neural network. The receiver operating characteristic curve from the prediction exhibited a moderate to high area under the curve above 0.85. Moreover, the input dimensionality reduction retrieved the evaluation indices that are more strongly related to mood disorders. The analysis results suggest a stress threshold hypothesis, in which stroke-induced lesions promote stress vulnerability and may trigger mood disorders.


Materials and methods
Participants. We used clinical data obtained from 274 stroke inpatients (age: 64.9 ± 10.7 years) at the Hibino Hospital who could perform psychological and cognitive function tests. All patients provided informed consent. They were admitted to the Kaifukuki Rehabilitation Ward, where inpatients are admitted within two months of stroke onset, after acute treatment; rehabilitation was provided for up to 180 days and up to 3 hours a day. Patients under treatment for major psychiatric illnesses, such as major depression, bipolar disorder, schizophrenia, or schizoaffective disorder, were excluded (in this study, one patient had a history of autonomic imbalance and one had a history of insomnia/neurosis, but both had undergone treatment and were treatment-free on admission). The type of stroke was haemorrhage or occlusive stroke (infarction and transient ischaemic attack; TIA). Infarction in one patient was associated with mild subarachnoid haemorrhage. The study was approved by the Ethics Review Committee of the Hiroshima University Epidemiological Research and the Ethics Review Committee of the Shinaikai Hibino Hospital, and was performed in accordance with relevant guidelines and regulations.

Assessment of cognitive function. Cognitive function was examined using the Mini-Mental State Examination, with scores ranging from 0 to 30, and the Trail Making Test. Attention deficit was systematically evaluated using the Clinical Assessment of Attention Deficit, as described previously 18 , along with another Trail Making Test. Spatial neglect was examined using the Behavioural Inattention Test, and memory was examined using the Rivermead Behavioural Memory Test. The tasks analysed to assess cognitive function are listed in Table 1.
Measurements of stroke severity. The Functional Independence Measure (FIM) version 3.0 contains 18 items (13 motor and 5 cognitive items) that comprise an observer-rated summed rating scale for evaluating disability in terms of dependency (the lower the score, the greater the disability). The FIM is widely used to quantify disability in stroke patients 19 . Hence, all patients were examined for disability using the FIM within a week after admission and at discharge. The FIM improvement rate was calculated as follows: [(FIM score on discharge)−(FIM score on admission)]/[period of hospitalisation (weeks)].
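The FIM improvement rate defined above can be written as a short function. The following is a minimal sketch; the function name and the example scores are hypothetical and are not taken from the study data.

```python
def fim_improvement_rate(fim_admission: float, fim_discharge: float,
                         weeks_hospitalised: float) -> float:
    """FIM points gained per week of hospitalisation."""
    if weeks_hospitalised <= 0:
        raise ValueError("period of hospitalisation must be positive")
    return (fim_discharge - fim_admission) / weeks_hospitalised


# Hypothetical patient: FIM 70 on admission, 98 at discharge, 14 weeks in hospital.
print(fim_improvement_rate(70, 98, 14))  # 2.0
```

A negative rate indicates that the FIM score decreased during hospitalisation.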
Motor impairment in hemiplegic stroke patients was measured using the Brunnstrom Recovery Scale (BRS), wherein movement patterns were evaluated in the upper limb, fingers, and lower limb, and motor function was evaluated according to the stages of motor recovery 19 . The scale defines recovery only in broad categories, which correlate with those of progressive functional recovery (the lower the score, the greater the disability). The following analysis was performed by summing the stages of BRS of the upper limb, fingers, and lower limb.
The presence or absence of ataxia and aphasia was evaluated at admission. Lesion location of infarction was assessed using magnetic resonance imaging (MRI) and that of haemorrhage was assessed using MRI or computed tomography (CT), and categorised into brainstem, cerebellum, right or left basal ganglia, right or left subcortical, and right or left cortical.
Psychological assessment. The Hospital Anxiety and Depression Scale (HADS) was used to identify depression and anxiety, and the apathy score was used to identify apathy. We derived HADS-Depression and HADS-Anxiety scores using the HADS, and patients with HADS-Depression and HADS-Anxiety scores above 9 were classified as having post-stroke depression (PSD) and anxiety, respectively. In addition, patients were adjudged to have apathy when they had an apathy score above 16. To assess stress, we used the Japanese version of the Perceived Stress Scale (JPSS) 20,21 . This scale is widely used to measure the degree to which situations in a subject's life are appraised as stressful.
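The cut-off rules above can be expressed as a small labelling function. This is a sketch under one assumption: "above 9" and "above 16" are read as strict inequalities. The function and constant names are hypothetical.

```python
HADS_CUTOFF = 9     # assumption: "above 9" is read as score > 9
APATHY_CUTOFF = 16  # assumption: "above 16" is read as score > 16


def mood_labels(hads_depression: int, hads_anxiety: int, apathy_score: int) -> dict:
    """Return the presence (True) or absence (False) of each mood disorder."""
    return {
        "PSD": hads_depression > HADS_CUTOFF,
        "anxiety": hads_anxiety > HADS_CUTOFF,
        "apathy": apathy_score > APATHY_CUTOFF,
    }


# Hypothetical scores for one patient.
print(mood_labels(hads_depression=10, hads_anxiety=9, apathy_score=17))
```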
Proposed machine learning approach. To analyse the relationship between mood disorders and motor/cognitive functions, we used a probabilistic artificial neural network called log-linearized Gaussian mixture network (LLGMN) 22 . This network enables the estimation of the statistical distribution of sample data based on machine learning and the prediction of the posterior probability of the class for unknown input data. We propose a mood disorder identification model composed of three LLGMNs, as illustrated in Fig. 1. We independently predicted the posterior probabilities of each mood disorder, namely, PSD, apathy, and anxiety. The input to each LLGMN is a $P$-dimensional evaluation index vector, $z^{(n)} = [z_1^{(n)}, z_2^{(n)}, \ldots, z_P^{(n)}]^{\mathrm{T}} \in \mathbb{R}^P$, obtained from the eight abovementioned evaluation tests, where $n$ identifies the patient. The output is a two-dimensional posterior probability vector, $Y_r^{(n)} \in \mathbb{R}^2$, representing the absence or presence of a mood disorder, with $r = 1, 2, 3$ indicating PSD, apathy, and anxiety, respectively.
We first divided the 274 patients into four groups: control, depression, apathy, and anxiety groups (Table 1). The machine learning analysis was conducted for each combination of the control group and mood disorder groups. The training dataset comprised evaluation indices $z^{(n)}$ ($n = 1, 2, \ldots, N$) of $N$ patients from each combination as training inputs, and the corresponding labels (absence/presence of mood disorders) $Q_r^{(n)} \in \mathbb{R}^2$. The proposed model was trained using error backpropagation, and the prediction accuracy was verified using the validation dataset composed of the data excluded from the training dataset. Posterior probabilities $Y_r^{(n')}$ of each mood disorder were predicted by inputting validation inputs $z^{(n')}$ ($n' = 1, 2, \ldots, N'$) to the model. The prediction accuracy was then evaluated using the area under the curve (AUC) from the receiver operating characteristic (ROC) curve on the predicted posterior probability $Y_r^{(n')}$.
Input dimensionality reduction using partial Kullback-Leibler information. Estimating the statistical distribution of a high-dimensional input space ($z^{(n)} \in \mathbb{R}^P$) may lead to suboptimal solutions reflecting local minima. In addition, it is not possible to clarify the relationship between each evaluation index and mood disorder simply by predicting the absence or presence of the mood disorder using the LLGMN. Therefore, we reduced the input dimension using the partial Kullback-Leibler (KL) information measure 23 and identified the most relevant indices related to each mood disorder.
The partial KL information measure $E_{r,[I+\bar{i}]}$ is defined in Eq. (1), where $Y_{r,[I+\bar{i}]}$ denotes the vectors representing the posterior probability distributions of the classes predicted by inputting the evaluation index vector with these dimensions reduced, and $I_r(Q, Y)$ is the KL information between arbitrary probability distributions $Q$ and $Y$. The input dimensionality reduction proceeds as follows.
1. The number of reduced dimensions is initialised as $d = 0$ ($D = P$), and the set of reduction dimensions is initialised as the empty set ($I = \emptyset$).
2. The evaluation index vector $z_{[I+\bar{i}]} \in \mathbb{R}^{D-1}$, from which the dimensions $I + \bar{i}$ have been deleted, is input to the LLGMN.
3. The KL information measure [...]
4. The dimension maximising the partial KL information, $\bar{i}_{\max} = \arg\max_{\bar{i} \in \bar{I}} E_{r,[I+\bar{i}]}$, is obtained using Eq. (1), and this dimension is added to $I$ as a new reduction dimension.
5. After setting $d + 1$ as the new number of reduced dimensions $d$, steps 2 to 4 are repeated until $d = P - 1$.
Following the above procedure, the model with the largest AUC is adopted for prediction.
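The greedy loop described above can be sketched as follows. This is an illustration under stated assumptions, not the authors' implementation: the scoring function `partial_kl` and the evaluation function `auc_of` are hypothetical callables supplied by the user (in the study they would involve retraining and evaluating the LLGMN), and the full-dimensional model ($d = 0$) is assumed to be among the candidates for the final choice. The helper `kl_information` implements the standard KL information between two discrete distributions.

```python
import math
from typing import Callable, FrozenSet, List, Sequence, Tuple


def kl_information(q: Sequence[float], y: Sequence[float]) -> float:
    """KL information I(Q, Y) = sum_k q_k * log(q_k / y_k), with 0 * log 0 = 0."""
    total = 0.0
    for q_k, y_k in zip(q, y):
        if q_k == 0.0:
            continue
        if y_k == 0.0:
            return math.inf
        total += q_k * math.log(q_k / y_k)
    return total


def greedy_reduction(
    n_dims: int,
    partial_kl: Callable[[FrozenSet[int], int], float],  # hypothetical scorer
    auc_of: Callable[[FrozenSet[int]], float],           # hypothetical evaluator
) -> Tuple[List[int], FrozenSet[int]]:
    """Return the removal order and the retained dimensions of the best model."""
    removed: List[int] = []            # the set I, in order of removal
    best_removed: FrozenSet[int] = frozenset()
    best_auc = auc_of(best_removed)    # AUC with all P indices (d = 0)
    while len(removed) < n_dims - 1:   # repeat until d = P - 1
        current = frozenset(removed)
        candidates = [i for i in range(n_dims) if i not in current]
        # Step 4: the candidate maximising the measure joins the reduced set.
        i_max = max(candidates, key=lambda i: partial_kl(current, i))
        removed.append(i_max)
        auc = auc_of(frozenset(removed))
        if auc > best_auc:
            best_auc, best_removed = auc, frozenset(removed)
    kept = frozenset(range(n_dims)) - best_removed
    return removed, kept


# Toy stand-ins for the two callables (not real data).
noise = [0.1, 0.9, 0.5, 0.2]
order, kept = greedy_reduction(
    4,
    partial_kl=lambda removed, i: noise[i],
    auc_of=lambda removed: 0.9 - 0.1 * abs(len(removed) - 2),
)
print(order, sorted(kept))  # [1, 2, 3] [0, 3]
```

Passing the scorer and evaluator as callables keeps the loop independent of how Eq. (1) and the AUC are actually computed.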

Relationship between evaluation indices and mood disorders. The proposed machine learning approach based on the LLGMN was evaluated using the dataset obtained from the 274 patients. The dataset was composed of the 36-dimensional evaluation index vector containing the results of the evaluation tests and the corresponding absence/presence of the mood disorder determined by the HADS and apathy scores. The input dimensionality reduction using the partial KL information enabled the extraction of representative indices for predicting PSD, apathy, and anxiety. Then, the ROC curve was obtained from the posterior probability of each mood disorder predicted by the LLGMN and labels (absence/presence of mood disorder). The prediction accuracy of mood disorders was evaluated using the AUC obtained by ten-fold cross-validation.
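For reference, the AUC can be computed directly from labels and predicted posterior probabilities using its rank-based interpretation: the probability that a randomly chosen positive case receives a higher score than a randomly chosen negative case, with ties counted as one half. The sketch below also includes a simple fold splitter; both function names are hypothetical, and how the folds were formed and aggregated in the study is not specified here.

```python
from typing import List, Sequence


def roc_auc(labels: Sequence[int], scores: Sequence[float]) -> float:
    """Rank-based AUC: P(score of positive > score of negative), ties count 0.5."""
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("both classes must be present")
    wins = sum(1.0 if p > n else 0.5 if p == n else 0.0 for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


def kfold_indices(n_samples: int, n_folds: int = 10) -> List[List[int]]:
    """Split sample indices into interleaved validation folds."""
    return [list(range(fold, n_samples, n_folds)) for fold in range(n_folds)]


# Toy example: label 1 = mood disorder present, scores = predicted posterior.
print(roc_auc([0, 0, 1, 1], [0.1, 0.4, 0.35, 0.8]))  # 0.75
```

The pairwise loop is quadratic in the number of samples, which is acceptable for a dataset of a few hundred patients.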
We compared the prediction accuracy of the proposed model with the reduced input dimension against three classification models: stepwise multiple linear regression, logistic regression, and partial least squares (PLS) regression. In the stepwise multiple linear regression, variables were selected using a forward-backward stepwise selection method. All variables were used in the logistic regression and PLS regression; the number of latent factors in the PLS regression was set to 3. The sensitivity, specificity, positive predictive value (PPV), and negative predictive value (NPV) that provided the maximum AUC were also calculated and compared with those of the proposed method. In this experiment, the positive (presence of mood disorder) and negative (absence of mood disorder) data included in the dataset were balanced throughout the analyses to eliminate the bias due to the mood disorder incidence.
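The four evaluation measures named above follow from the confusion matrix at a given decision threshold. The following is a minimal sketch; the threshold value and the function name are assumptions, since the operating point used in the study is not restated here.

```python
from typing import Dict, Sequence


def classification_measures(labels: Sequence[int], scores: Sequence[float],
                            threshold: float) -> Dict[str, float]:
    """Sensitivity, specificity, PPV, and NPV at a given score threshold."""
    tp = fp = tn = fn = 0
    for y, s in zip(labels, scores):
        predicted_positive = s >= threshold
        if y == 1 and predicted_positive:
            tp += 1
        elif y == 1:
            fn += 1
        elif predicted_positive:
            fp += 1
        else:
            tn += 1

    def ratio(a: int, b: int) -> float:
        return a / b if b else float("nan")

    return {
        "sensitivity": ratio(tp, tp + fn),
        "specificity": ratio(tn, tn + fp),
        "PPV": ratio(tp, tp + fp),
        "NPV": ratio(tn, tn + fn),
    }


# Toy example with a hypothetical threshold of 0.5.
print(classification_measures([1, 1, 1, 1, 0, 0],
                              [0.9, 0.8, 0.7, 0.3, 0.6, 0.2], 0.5))
```

Because PPV and NPV depend on the proportion of positive cases, balancing the positive and negative data, as described above, makes these values comparable across mood disorders.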
Finally, we analysed and compared the decrease in AUC when one input dimension was disregarded from the input indices after dimensionality reduction using the partial KL information. This analysis enabled us to rate the importance of each evaluation index for the considered mood disorders.
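This leave-one-index-out rating can be sketched as below. It assumes that the decrease is expressed as a percentage relative to the AUC of the model with all retained indices; the index names and AUC values are hypothetical placeholders, not results from the study.

```python
from typing import Dict, List, Tuple


def auc_drop_percent(auc_full: float, auc_without: float) -> float:
    """Relative decrease in AUC (in percent) when one index is left out."""
    return 100.0 * (auc_full - auc_without) / auc_full


def rank_indices(auc_full: float,
                 auc_without_index: Dict[str, float]) -> List[Tuple[str, float]]:
    """Sort indices from the largest to the smallest AUC decrease."""
    drops = {name: auc_drop_percent(auc_full, auc)
             for name, auc in auc_without_index.items()}
    return sorted(drops.items(), key=lambda item: item[1], reverse=True)


# Hypothetical AUCs after removing each index from the reduced input set.
ranking = rank_indices(0.9, {"index_a": 0.72, "index_b": 0.855, "index_c": 0.81})
for name, drop in ranking:
    print(f"{name}: {drop:.1f}%")
```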

Statistical analysis. The differences between the control and psychiatric groups (depression, apathy, and anxiety) were assessed using the χ 2 test for categorical values and the Kruskal-Wallis test for continuous values. A post-hoc test was conducted based on the Steel-Dwass test. Values were considered to be significant at p < 0.05 . JMP Pro 14.2.0 (SAS Institute Inc., Cary, NC, USA) was used for the analyses. To compare the AUC of the proposed method with those of other methods, a pairwise comparison with the proposed method was performed using the DeLong test 24 with Holm adjustment. The DeLong test is a statistical test method for comparing two AUCs and is widely used owing to its non-parametric approach.
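The Holm adjustment applied to the pairwise comparisons is a step-down procedure: the sorted p-values are multiplied by decreasing factors and then made monotone. The sketch below illustrates it on hypothetical p-values for three comparisons; it does not implement the DeLong test itself.

```python
from typing import List, Sequence


def holm_adjust(p_values: Sequence[float]) -> List[float]:
    """Holm step-down adjusted p-values, returned in the original order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_max = 0.0
    for rank, i in enumerate(order):
        # The smallest p-value is multiplied by m, the next by m - 1, and so on.
        candidate = min(1.0, (m - rank) * p_values[i])
        running_max = max(running_max, candidate)  # enforce monotonicity
        adjusted[i] = running_max
    return adjusted


# Hypothetical raw p-values from three pairwise AUC comparisons.
print(holm_adjust([0.01, 0.04, 0.03]))
```

A comparison is then declared significant when its adjusted p-value is below the chosen level, here 0.05.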

Results
Baseline structures. Table 1 presents the baseline data for stroke patients categorised into control, depression, apathy, and anxiety groups. Cognitive FIM was significantly lower in the mood disorder groups. Age, the hospitalisation period, and physical disabilities (paralysis, ataxia, aphasia) did not differ significantly among the groups. In addition, in the presence of a mood disorder, the JPSS score was high and several cognitive functions were impaired.

ROC analysis. The results of the ROC analysis comparing the proposed model with three linear classification models are depicted in Fig. 2. The ROC curves of each method are overlaid for each group (Fig. 2a). The proposed model with reduced input dimensionality achieved an AUC above 0.85 for all mood disorders, indicating suitable classification accuracy, and an AUC above 0.90 for PSD and anxiety (Fig. 2b). Overall, the AUC of the proposed model with reduced input dimensionality was the highest for all mood disorders among the evaluated models. The evaluation measures for each method are presented in Table 2.
We removed the indices one by one to evaluate the effect of each missing index on the classification accuracy of the proposed LLGMN model. Specifically, an index whose removal causes a large drop in accuracy makes a high contribution to mood disorder identification. The results from this evaluation are presented in Tables 3-5. The number of input indices after input dimensionality reduction was 11 for PSD, 14 for apathy, and 9 for anxiety. As shown in Fig. 2, the AUC was 0.949 for PSD, 0.850 for apathy, and 0.950 for anxiety. For PSD, removing the JPSS, wrong answers in SDMT, and digit span backward results reduced the AUC by 20.1%, 9.82%, and 8.17%, respectively. For apathy, removing the JPSS, digit span backward, and tapping span backward results reduced the AUC by 15.0%, 6.97%, and 4.74%, respectively. For anxiety, removing the JPSS, digit span backward, and motor FIM on admission results reduced the AUC by 20.5%, 13.5%, and 10.3%, respectively.

Discussion
We devised a machine learning approach to analyse the relationship between post-stroke mood disorders and indices obtained from functional evaluation tests. We confirmed that the proposed model could predict post-stroke neuropsychiatric symptoms (i.e. PSD and anxiety) with moderate to high accuracy, with an AUC above 0.85 for all the evaluated mood disorders (see Fig. 2). The classification characteristics of each method are summarised in Table 2, indicating that the proposed method can classify both negative and positive data with a relatively good balance. Therefore, the proposed non-linear model effectively predicts post-stroke neuropsychiatric symptoms and outperforms traditional linear classification.
Scientific Reports | (2020) 10:19571 | https://doi.org/10.1038/s41598-020-76429-z
PSD is widely thought to be associated with stroke severity and the degree of physical and cognitive impairment. As shown in Table 1, scores on many cognitive function tests were lower in the depression, apathy, and anxiety groups than in the control group. In addition, considering the severity after stroke, cognitive FIM was lower in the presence of a mood disorder both at admission and at discharge. Cerebrovascular lesions, which are associated with depression or cognitive impairment through related mechanisms, result in poor prognosis for PSD patients 3-5,25 . It is believed that the presence of PSD interferes with activities of daily living (ADL) due to cognitive dysfunction.
To examine the role of psychosocial stressors as risk factors in psychological illnesses (i.e. depression or anxiety), the impact of an "objectively" stressful event should be determined by one's perception of its stressfulness 21 . Cohen et al. developed the Perceived Stress Scale, which is one of the most commonly used scales to measure the degree to which situations in one's life are appraised as stressful 20,21 . Our results revealed that post-stroke neuropsychiatric symptoms are correlated with JPSS scores, suggesting that post-stroke mood disorders are associated with mental stress. However, our results also demonstrated only a weak relation of PSD and anxiety with the severity of physical impairment (paresis measured using the BRS). The mechanism may therefore not be as simple as severe symptoms increasing mental stress and thereby readily leading to the onset of depression. Even when stress is applied, patients tend to cope with it and thereby prevent depression; however, if they are vulnerable to stress owing to stroke (threshold hypothesis), the introduction of a sudden and unpredictable life-threatening stressor such as stroke could potentially lead to mood disorders 5,17 . Thus, perceived stress appears to affect post-stroke neuropsychiatric symptoms more strongly than objective stress measures.
The aetiology of post-stroke mood disorders (depression, apathy, and anxiety) is believed to be multifactorial and is poorly understood 16 . Additionally, cognitive impairment, stroke severity, and physical disability have been the most consistently identified associated factors 2,16 . In this study, we attempted to predict post-stroke mood disorders using machine learning by inputting the abovementioned factors, and obtained high prediction accuracies for cases of depression, apathy, and anxiety. Currently, a diagnostic kit for major depression is used to diagnose PSD; however, unlike major depression, PSD is characterised by variation and different pathological conditions, and hence an accurate diagnosis is infeasible 16 . PSD is therapeutically resistant in comparison with major depression 2,3 , and a more detailed diagnosis of PSD, such as depressed mood, decreased motivation, and anxiety, is beneficial for treatment 16 .

Conclusion and limitations
In conclusion, we found that post-stroke neuropsychiatric symptoms (i.e. PSD and anxiety) may be suitably identified using the LLGMN based on test scores obtained from stroke patients. Furthermore, we evaluated the contribution of each index to each neuropsychiatric symptom using the partial KL information measure. This study is a first step towards accurately diagnosing PSD using data obtained in routine practice without any special equipment.
The degree of depression, apathy, and anxiety observed in this study was relatively mild in comparison with that typically observed in patients with major depression. Moreover, patients with severe comprehension deficits who could not perform the cognitive function tests were excluded from this study. Thus, these results may not be applicable to all stroke patients. To categorise the psychiatric grouping, we used simple screening tools; however, more in-depth assessment tools are desirable and will be considered in future studies to improve the accuracy of the diagnosis. Moreover, we intend to conduct detailed studies using MRI images to elucidate the aetiology and improve diagnostic techniques of PSD.

Data availability
The datasets generated and/or analysed in the current study are available from the corresponding author upon reasonable request.