Positive impact of short-term gait rehabilitation in Parkinson patients: a combined approach based on statistics and machine learning

Parkinson’s disease is the second most common neurodegenerative disorder in the world. Given that gait dysfunction is a major motor symptom of the disease, gait analysis can provide clinicians with quantitative information about the rehabilitation outcome of patients. In this scenario, wearable inertial systems for gait analysis can be a valid tool to assess the functional recovery of patients in an automatic and quantitative way, supporting clinicians in decision making. The aim of this study is to evaluate the impact of short-term rehabilitation on the gait and balance of patients with Parkinson’s disease. A cohort of 12 patients with idiopathic Parkinson’s disease performed a gait analysis session instrumented with a wearable inertial system (Opal System, by APDM Inc.), and the spatial and temporal parameters were analyzed through a statistical and machine learning approach. Six out of fourteen motion parameters exhibited a statistically significant difference between the measurements at admission and at discharge, while the machine learning analysis confirmed the separability of the two phases in terms of Accuracy and Area under the Receiver Operating Characteristic Curve. The rehabilitation treatment especially improved the motion parameters related to gait. The study shows the positive impact on the gait of a short-term


Introduction
Parkinson's Disease (PD) is a progressive, age-related neurodegenerative disorder, which usually starts between 30 and 60 years of age. It occurs in about 1% of the population aged over 60 and its prevalence increases in the elderly. Around 20% of people over 80 have parkinsonism, a clinical syndrome characterized, in various combinations, by tremor, bradykinesia, rigidity, and postural instability [1]. PD is the second most common neurodegenerative disease after Alzheimer's disease; it may be less common in populations of Asian or African origin and more common in men than in women [2]. As the life expectancy of the world population increases, the prevalence and incidence of PD are expected to double by 2030 [3]. Neuropathological hallmarks of PD include the selective loss of dopaminergic neurons in the Substantia Nigra pars compacta of the central nervous system [4], the absence of dopamine in the basal ganglia circuit, which leads to the loss of automatic gait [5], and the presence of Lewy bodies containing alpha-synuclein in several brain regions [6]. PD clinical features include both motor symptoms, such as bradykinesia, akinesia, freezing, resting tremor, rigidity, and gait and stability impairment [7,8], and non-motor symptoms, such as cognitive impairment, depression, anxiety, psychosis and constipation. However, gait and posture disorders remain the most discriminating features of the pathology [7,9]; gait dysfunctions represent a major motor symptom in PD and have been associated with an increased risk of falls and immobility, which in turn contributes to greater disability, institutionalization with consequent increases in healthcare costs, and, ultimately, death [10]. As the disease progresses, these gait disorders become more pronounced: initially they can disable patients and severely limit their quality of life [11], while later they can lead to a rapid loss of independence and to reduced survival, between 5.3 and 9.7 years from the onset of symptoms [12].
In this scenario, gait analysis of PD patients is a valid tool to monitor, in a quantitative way, both the evolution of the disease and the improvements in gait and posture following pharmacological therapy and/or functional rehabilitation. Conventional gait analysis records spatiotemporal and kinematic parameters of the gait cycle during a specific gait protocol; this strategy is a quick and reliable way to measure the walking and balance performance of patients. Several instruments have been adopted over the years to collect spatiotemporal and kinematic parameters during patients' gait. The gold standard has been represented by three-dimensional motion capture systems and force plates; nevertheless, these instruments have proved expensive to acquire and operate and are, therefore, often unfeasible for clinical use. Thus, it has been necessary to find and study alternative low-cost solutions; previous studies have shown that e-textile socks [13][14][15], Kinect™ [16], Wii Fit [17] and even webcams can achieve significant results and are useful for clinicians performing quantitative assessments, despite their intrinsic limitations. Wearable inertial systems for gait analysis, which are spreading in several fields [18][19][20][21], including the rehabilitation setting [22][23][24], have also proved to be a cost-effective alternative. In fact, the development of inertial measurement units (IMUs) for kinematic assessments has been a major technological advancement in both the biomechanics and wearable sensors fields. IMUs offer a set of significant advantages: they are relatively inexpensive, allow a virtually unlimited number of steps to be evaluated, require a less complex experimental setup and reduce the time required for the examinations.
These advantages explain the rapid spread of these devices even in the clinical field. In recent years, the assessment of the rehabilitation outcome of PD patients using the advanced instruments described above has become established in clinical practice. It is no coincidence that traditional, subjective and empirical assessments based on clinical scales have lost importance in favor of strategies that quantitatively evaluate, possibly through gait analysis instrumentation, the kinematic parameters able to support clinicians in decision making. Moreover, artificial intelligence, through the use of machine learning (ML) algorithms, is spreading in clinical research and, more specifically, in the gait analysis field. Previous research has also aimed to distinguish motor disorders by means of gait analysis and ML: De Vos et al. [25] recently used a wearable inertial system and ML algorithms to discriminate progressive supranuclear palsy from PD.
Although several contributions in the field have addressed the evaluation of the rehabilitation outcome for PD patients, to the best of the authors' knowledge no study has investigated, by means of quantitative kinematic evaluations, whether short-term rehabilitation programs can promote objective improvements in both the gait and balance performance of PD patients. Therefore, in this paper we statistically assessed the effect of a 2-month rehabilitation program for PD patients by evaluating the differences between admission and discharge in several kinematic parameters extracted using a wearable inertial system. Moreover, we performed several ML analyses as a countercheck. Specifically, we verified the overall improvement of patients by studying the degree of separability of the two classes (hospitalization and discharge), starting from the statistically significant kinematic features computed by the wearable system. Finally, through a feature importance analysis we identified the most informative and predictive kinematic features following the short-term rehabilitation.

Wearable inertial system for gait analysis
A commercial wearable inertial system for gait analysis (Opal System by APDM Inc.) was used in this work [26][27][28]. The Opal System is composed of 3 movement monitors, each including a 3-axis accelerometer with 14-bit resolution (with a range selectable on the basis of the specific use), a 3-axis gyroscope with 16-bit resolution and a 3-axis magnetometer with 12-bit resolution. The movement monitors, or Opal sensors, are attached to subjects using a selection of straps and are wirelessly connected by Bluetooth 3.0 to a remote laptop running the Mobility Lab software, which processes all movement data and computes the main kinematic parameters through native algorithms. Moreover, the Docking Station is used to charge and configure the Opal sensors, while the Access Point enables the communication between the sensors and the laptop (Figure 1).

Study Population
Forty-two PD patients undergoing a rehabilitation treatment at the Institute of Care and Scientific Research of Telese Terme (BN) in Italy were consecutively recruited for the study. Of these, only 12 patients took part in the study, as they met the following criteria: 1) Inclusion Criteria: age between 50 and 80 years; diagnosis of idiopathic PD; adequate compliance and sufficiently stable response to therapy; ability to perform a gait analysis session without any support or interruption. 2) Exclusion Criteria: secondary Parkinsonism; presence of severe cognitive impairment. The patients were analyzed through gait analysis instrumentation at the beginning and at the end of the hospitalization (2 months) in order to assess their rehabilitation outcome in terms of gait and balance through the calculation of kinematic parameters.
The clinical characteristics of the patient cohort are shown in Table 1.
All patients gave informed consent. The local Ethics Committee approved the study, which was performed in accordance with the Declaration of Helsinki.

Therapeutic interventions
The patients were assessed within 48 hours of admission by a neurologist or a physiatrist. Physicians and nurses collected the demographic and clinical data. All patients but one were on levodopa and/or dopamine-agonist pharmacological treatment (range of Levodopa Equivalent Daily Dose: 0-1464 mg). Following the admission evaluations, the multidisciplinary team defined the therapeutic plan, which was delivered in sessions of 2 hours per day, 6 sessions per week, over the 6-8 weeks of stay at the rehabilitation department.
The individually tailored neurorehabilitation sessions included physiotherapy (cardiovascular warm-up activities, relaxation exercises, muscle stretching, exercises to improve the range of motion of spinal, pelvic and scapular joints, exercises to improve the functionality of abdominal muscles and postural changes in the supine position, exercises to improve balance and gait), speech and swallowing therapy, occupational therapy (transfers from sitting to standing, rolling from supine to sitting and from sitting to supine, dressing, use of tools, and exercises to improve hand functionality and skills) and neuropsychology (cognitive stimulation programs aimed at enhancing the cognitive and social functioning of each patient).

Study protocol
All the patients underwent a gait analysis session instrumented with the Opal system, with one Opal sensor attached to each shin by Velcro straps and one Opal sensor attached to the lower back through a belt (Figure 2). The gait analysis session consisted of three consecutive trials separated by a pause of at least one minute. In each trial the patient stood quietly for 30 seconds, walked for 7 meters, turned 180 degrees around a pin and walked back to the starting point (Figure 3).
The Instrumented Stand and Walk (ISAW) protocol, included in the Mobility Lab software, allows several kinematic parameters to be computed. In this study, fourteen parameters related to postural sway, anticipatory postural adjustment (APA) during step initiation, gait and turning were evaluated; for each patient and each motion parameter, the mean value over the three trials was reported. The computed parameters are listed as follows along with a brief description.

Statistical and machine learning analysis
Given the small sample size, we performed a nonparametric paired test. Two-tailed Wilcoxon matched-pairs signed rank tests (95% confidence level) were chosen, with statistical significance defined as p-value < 0.05. IBM SPSS Statistics (version 26) was employed. The tests were performed to determine whether a statistically significant difference could be ascertained between the admission and the discharge of the patients for each kinematic parameter and for each of the following clinical scales: Berg Balance (BERG) Scale, Unified Parkinson Disease Rating Scale part III (UPDRS III) and FIM Motor.
Moreover, a ML analysis was carried out to assess the degree of separability between the admission and discharge measurements considering the motion parameters described above. Considering the small sample size, our data were doubled through an oversampling technique, specifically the Synthetic Minority Oversampling Technique (SMOTE), to perform a reliable ML analysis [29,30]. Because of the negative impact of irrelevant features on most ML algorithms, a filter-method feature selection, based on the statistical significance of the features between the hospitalization and discharge phases, was carried out.
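As an illustration of the paired test used in this section, the following is a minimal sketch of a two-tailed Wilcoxon matched-pairs signed rank test written in plain Python. It is not the SPSS procedure used in the study: the function names and the example values are hypothetical, zero differences are simply dropped, and the p-value is computed exactly by enumerating all sign assignments (feasible only for small cohorts), whereas statistical packages often report a normal-approximation p-value that can differ slightly.

```python
from itertools import product


def average_ranks(values):
    """Rank values from 1..n, giving tied values their average rank."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    i = 0
    while i < len(order):
        j = i
        # Extend j over the run of values tied with position i.
        while j + 1 < len(order) and values[order[j + 1]] == values[order[i]]:
            j += 1
        avg = (i + j) / 2 + 1  # mean of the 1-based positions i+1..j+1
        for k in range(i, j + 1):
            ranks[order[k]] = avg
        i = j + 1
    return ranks


def wilcoxon_signed_rank(pre, post):
    """Return (W, exact two-tailed p-value) for paired samples.

    W is the smaller of the positive and negative rank sums.
    Pairs with zero difference are discarded (an assumption of this sketch).
    """
    diffs = [b - a for a, b in zip(pre, post) if b != a]
    ranks = average_ranks([abs(d) for d in diffs])
    total = sum(ranks)
    w_plus = sum(r for d, r in zip(diffs, ranks) if d > 0)
    w_obs = min(w_plus, total - w_plus)
    # Under the null hypothesis every sign pattern is equally likely.
    extreme = 0
    for signs in product((0, 1), repeat=len(ranks)):
        w = sum(r for s, r in zip(signs, ranks) if s)
        if min(w, total - w) <= w_obs:
            extreme += 1
    return w_obs, extreme / 2 ** len(ranks)


# Hypothetical stride-velocity values for six patients (not study data).
pre = [60.1, 62.3, 58.7, 65.0, 61.2, 59.9]
post = [66.0, 65.1, 63.5, 66.2, 68.0, 64.8]
w, p = wilcoxon_signed_rank(pre, post)
print(f"W = {w}, exact two-tailed p = {p:.5f}")
```

With a cohort of 12 patients the enumeration covers at most 4096 sign patterns, so the exact computation remains inexpensive.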
In order to assess the degree of separability between the two phases, which implies a certain degree of clinical improvement, we used Accuracy and Area under the Receiver Operating Characteristic curve (AUCROC) as evaluation metrics. Accuracy is an adequate metric because our dataset is perfectly balanced between the two classes, while AUCROC measures separability, showing how well the ML model can distinguish between the classes. To measure the degree of difference between admission and discharge we implemented four tree-based ML algorithms: Random Forest (Ran-F), Rotation Forest (Rot-F), Ada-Boost of Decision Stumps (AB-DS) and Gradient Boosted Decision Tree (GB-DT). Tree-based ensemble algorithms build on the simple decision tree and are more powerful in classification tasks.
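The two evaluation metrics named above can be sketched in a few lines of plain Python. This is only an illustration, not the Knime implementation used in the study: the function names, the labels (0 for admission, 1 for discharge) and the scores are hypothetical. AUCROC is computed through its rank interpretation, i.e. the probability that a randomly chosen positive instance receives a higher score than a randomly chosen negative one.

```python
def accuracy(y_true, y_pred):
    """Fraction of instances whose predicted label equals the true label."""
    return sum(t == p for t, p in zip(y_true, y_pred)) / len(y_true)


def auc_roc(y_true, scores):
    """AUC as the probability that a positive outranks a negative.

    Ties between a positive and a negative score count as one half.
    """
    pos = [s for t, s in zip(y_true, scores) if t == 1]
    neg = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum((p > n) + 0.5 * (p == n) for p in pos for n in neg)
    return wins / (len(pos) * len(neg))


# Hypothetical classifier scores: 0 = admission (PRE), 1 = discharge (POST).
y_true = [0, 0, 0, 1, 1, 1]
scores = [0.10, 0.40, 0.35, 0.80, 0.30, 0.90]
y_pred = [1 if s > 0.5 else 0 for s in scores]

print("Accuracy:", accuracy(y_true, y_pred))
print("AUCROC:", auc_roc(y_true, scores))
```

Accuracy depends on the chosen decision threshold (0.5 here), while AUCROC summarizes the ranking quality over all thresholds, which is why the two metrics are reported together.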
Ran-F is an ensemble learning method based on multiple decision trees. A given instance vector is assigned to a specific class by a majority vote over the decisions provided by each tree forming the forest [31]. In this paper the Information Gain (IG) Ratio was adopted as split criterion; no limit on the number of levels (tree depth) was imposed and no minimum node size was set. As for the forest options, we set the number of models to 100 and used a static random seed.
Rot-F is an ensemble learning method that combines the random subspace and bagging approaches with principal component feature generation to construct an ensemble of decision trees [32]. In this paper, J48 was used as the base classifier, with a number of iterations equal to 10 and Principal Component Analysis (PCA) as the projection filter.
AB-DS is a machine learning meta-algorithm in which the instance weights are re-assigned at each iteration, with higher weights given to incorrectly classified instances [33]. In this paper, an ensemble of decision stumps (decision trees with a single split) was considered, with a number of iterations equal to 10 and a weight threshold equal to 100.
GB-DT aims to minimize the loss function of the model by adding weak learners, using gradient descent to find a local minimum of a differentiable function [34]. In this paper, the maximum number of levels (tree depth) was set to 4. As for the boosting options, we set the number of models to 100 and the learning rate to 0.1.
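To make the re-weighting idea behind AB-DS more concrete, the following is a minimal plain-Python sketch of discrete AdaBoost with decision stumps. It is an illustration under stated assumptions, not the implementation used in the study: labels are coded as -1 (PRE) and +1 (POST), the function names and data are hypothetical, and the weight threshold option mentioned above is not modeled.

```python
import math


def stump_predict(row, feature, threshold, polarity):
    """Single-split rule: predict `polarity` above the threshold, else the opposite."""
    return polarity if row[feature] > threshold else -polarity


def fit_stump(X, y, w):
    """Return (weighted error, feature, threshold, polarity) of the best stump."""
    best = None
    for f in range(len(X[0])):
        for thr in sorted({row[f] for row in X}):
            for pol in (1, -1):
                err = sum(wi for wi, row, t in zip(w, X, y)
                          if stump_predict(row, f, thr, pol) != t)
                if best is None or err < best[0]:
                    best = (err, f, thr, pol)
    return best


def adaboost_fit(X, y, n_rounds=10):
    """Train n_rounds stumps, increasing the weights of misclassified instances."""
    n = len(X)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(n_rounds):
        err, f, thr, pol = fit_stump(X, y, w)
        err = min(max(err, 1e-10), 1 - 1e-10)  # avoid division by zero / log(0)
        alpha = 0.5 * math.log((1 - err) / err)  # vote weight of this stump
        ensemble.append((alpha, f, thr, pol))
        # Misclassified instances (t * prediction = -1) get larger weights.
        w = [wi * math.exp(-alpha * t * stump_predict(row, f, thr, pol))
             for wi, row, t in zip(w, X, y)]
        z = sum(w)
        w = [wi / z for wi in w]
    return ensemble


def adaboost_predict(ensemble, row):
    """Weighted vote of all stumps."""
    score = sum(a * stump_predict(row, f, thr, pol) for a, f, thr, pol in ensemble)
    return 1 if score > 0 else -1


# Hypothetical data: [stride velocity, double support]; -1 = PRE, +1 = POST.
X = [[60.0, 24.0], [62.0, 23.5], [65.0, 23.0],
     [70.0, 21.0], [72.0, 20.5], [75.0, 20.0]]
y = [-1, -1, -1, 1, 1, 1]
model = adaboost_fit(X, y, n_rounds=10)
print([adaboost_predict(model, row) for row in X])
```

The number of rounds mirrors the 10 iterations reported above; in practice the predictions would be evaluated on held-out instances rather than on the training data as in this toy example.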
Leave-one-out cross-validation [35] was performed to evaluate the performance of the four tree-based predictive models, given the sample size. Moreover, feature importance was computed by means of the IG, considering only the features that showed a statistically significant difference between the admission and discharge phases (Table 2). IG is an indicator of the amount of information provided by each feature [36]. The ML analysis was performed by means of the Knime Analytics Platform (Version 4.1.3), a well-known open source software widely used in the scientific literature for clinical studies [37][38][39][40][41].

Results
Table 2 shows the results of the two-tailed nonparametric Wilcoxon matched-pairs signed rank tests carried out for each spatiotemporal parameter, considering the values computed at admission (PRE) and discharge (POST). Abbreviations: s: significant (p-value < 0.05); ns: not significant (p-value ≥ 0.05); n.p.: not provided. Table 3 shows the results of the two-tailed nonparametric Wilcoxon matched-pairs signed rank tests on the scores of the three clinical scales used to evaluate the clinical picture of the PD patients at hospitalization (PRE) and at discharge (POST), after two months of rehabilitation treatment. Table 4 shows the results of the ML analysis carried out by means of the four tree-based algorithms in terms of Accuracy and AUCROC. Finally, Figure 4 shows the feature importance computed by means of the IG, considering only the features that were statistically significant (see Table 2).
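As an illustration of the IG-based feature importance reported in Figure 4, the sketch below computes the information gain of a numeric feature with respect to the PRE/POST class. How the numeric features were discretized in the actual analysis is not stated in the text, so this sketch assumes a single binary split at the best threshold; the function names and values are hypothetical.

```python
import math


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    n = len(labels)
    h = 0.0
    for c in set(labels):
        p = labels.count(c) / n
        h -= p * math.log2(p)
    return h


def information_gain(values, labels):
    """Largest entropy reduction obtained by one threshold split of the feature."""
    base = entropy(labels)
    best = 0.0
    # The largest value is skipped: splitting there leaves one side empty.
    for thr in sorted(set(values))[:-1]:
        left = [l for v, l in zip(values, labels) if v <= thr]
        right = [l for v, l in zip(values, labels) if v > thr]
        cond = (len(left) * entropy(left) + len(right) * entropy(right)) / len(labels)
        best = max(best, base - cond)
    return best


# Hypothetical values (not study data) for two features and six measurements.
labels = ["PRE", "PRE", "PRE", "POST", "POST", "POST"]
stride_velocity = [60.0, 62.0, 65.0, 70.0, 72.0, 75.0]  # separates the classes
sway_area = [1.0, 2.0, 1.0, 2.0, 1.0, 2.0]              # weakly informative

for name, values in [("Stride Velocity", stride_velocity), ("Sway Area", sway_area)]:
    print(name, round(information_gain(values, labels), 3))
```

Ranking the features by this quantity gives an ordering of the kind shown in Figure 4: a feature whose values separate the two phases well receives an IG close to the class entropy (1 bit for balanced classes), while an uninformative feature receives an IG close to zero.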

Discussion
Our study shows that the short-term rehabilitation treatment induced, in the considered cohort of PD patients, significant improvements in gait more than in posture, in line with the well-established role of rehabilitation in PD treatment [42]. In addition, the results show that simple spatiotemporal parameters of motion analysis can reliably detect the improvements resulting from rehabilitation.
In patients with PD, Stride Length reduction is considered the most important characteristic of gait; it is often coupled with a reduction of Stride Velocity and a tendency towards a longer Double Support Phase [7,43,44]. Among the gait-related parameters of the MobilityLab ISAW test, Double Support, Stance and Swing showed statistically significant differences (p-value = 0.023; see Table 2) and improved in line with these considerations and with the normal ranges reported in Table 2. Serrao et al. [45] observed a similar trend for these features when analyzing the 12 m walk of 36 PD patients whose kinematic parameters were computed using an optical system. Stride Velocity and Stride Length also showed statistically significant differences (p-value = 0.028 and 0.023, respectively). The statistical analysis showed that Stride Velocity increased after the rehabilitation treatment from 64.4 to 70.9 %stature/s in mean value. Bouça-Machado and co-workers [46] reported a similar increment for this kinematic parameter, which was again statistically significant (p-value = 0.004). A further confirmation comes from Serrao et al. [45], who found that Stride Velocity increased after a rehabilitation treatment of 10 weeks. Similarly, we observed that Stride Length increased from 73.0 to 78.4 in mean value; again, Serrao and co-workers [45] confirmed this trend. Finally, the same trends, namely increments in Stride Velocity and Stride Length, were further confirmed by Kleiner and co-workers [47], who designed a rehabilitation treatment (based on the use of Automated Mechanical Peripheral Stimulation) and quantified the gait spatiotemporal parameters using a single inertial sensor.
Considering the APA-related parameters, we found that 1 out of 3, namely First Step Length, varied in a statistically significant way (p-value = 0.008) between admission and discharge: its mean value went from 28.9 to 35.7 degrees. Conversely, neither of the two parameters considered for Postural Sway and Turn was statistically significant using the two-tailed Wilcoxon test. To the best of the authors' knowledge, papers using a similar methodology are still lacking; therefore, our pilot study suggests for the first time that the MobilityLab ISAW test and the APA-related parameter First Step Length convey relevant information about PD patients' rehabilitation outcome.
Table 3 shows that the clinical scale scores considered in this study improved after the rehabilitation treatment, quantitatively corroborating a statistically significant improvement in patients' motor recovery. Specifically, for the 12 patients we observed that the mean BERG score increased from 29.0 to 40.5 (p-value = 0.005), the UPDRS III mean score decreased from 27.7 to 22.4 (p-value = 0.008) and the FIM Motor mean score increased from 57.3 to 71.9 (p-value = 0.007). Serrao and co-workers [45] and Bouça-Machado and co-workers [46] showed a similar trend for the UPDRS III mean score in the 36 and 22 patients considered in their respective studies.
Several studies in the clinical literature aimed to distinguish patients with PD from other forms of parkinsonism or from healthy controls [25][48][49][50], but to the best of our knowledge no work has used ML to study the outcome of patients after rehabilitation.
Finally, the ML results further confirmed the positive impact of the short-term rehabilitation treatment. As shown in Table 4, high scores were achieved for both Accuracy and AUCROC. Conventionally, AUCROC values > 0.70 are considered to represent moderate discrimination, values > 0.80 good discrimination and values > 0.90 excellent discrimination. All the algorithms except GB-DT reached AUCROC values greater than 0.90, confirming the high degree of separation between the two classes (namely, PRE and POST). This use of ML has been proposed in other works, albeit in a different field [20]. This finding implies that patients' admission and discharge were effectively discriminated by the implemented tree-based ML algorithms, as also underlined by the high Accuracy scores, all greater than 0.90 except for the Rot-F algorithm.
Moreover, the feature importance computed by means of the IG confirms (see Figure 4) that, among the statistically significant parameters, Stride Velocity is the feature that determines the maximum separation between the two classes, namely PRE and POST. The positive impact of the rehabilitation treatment on this parameter has been underlined elsewhere [51] and is in line with our result.
This should be considered a preliminary study, and the presented findings have several intrinsic limitations. Firstly, the exclusion criteria limited the number of eligible patients from 48 to 12; a larger study cohort would have allowed us to collect more data which, we are confident, would have corroborated the presented results more strongly. Secondly, two of the 12 patients were not perfectly matched with the others in terms of age; nevertheless, they were included, since they were affected by idiopathic PD, in order not to reduce the study sample excessively. Thirdly, we presented only a few of the spatiotemporal parameters that can be computed using the Opal system coupled with the MobilityLab ISAW protocol; further analyses will be needed to verify whether other ISAW-related parameters could quantify the rehabilitation outcome. Finally, only a preliminary ML analysis was performed; overcoming the previous limitations could extend the possible methodologies and strengthen the results. For instance, a larger dataset would limit or remove the need for data augmentation (i.e., SMOTE) and might allow a different validation procedure and/or other ML algorithms to be considered.

Conclusions
The study showed a method to automatically ascertain the short-term rehabilitation outcome of patients with PD. The Opal system, coupled with the MobilityLab software, was used to compute several spatiotemporal parameters related to the ISAW test for the PRE and POST classes. The six statistically different parameters indicate that this methodology can be readily used to corroborate clinicians' evaluations in PD rehabilitation assessment. Furthermore, this methodology could have great potential for other applications where the ISAW test, or other test protocols already available in MobilityLab, could be used to pursue the same or similar goals.