Kinect and wearable inertial sensors for motor rehabilitation programs at home: state of the art and an experimental comparison

Emerging sensing and communication technologies are contributing to the development of many motor rehabilitation programs outside the standard healthcare facilities. Nowadays, motor rehabilitation exercises can be easily performed and monitored even at home by a variety of motion-tracking systems. These are cheap, reliable and easy to use, and also allow remote configuration and control of the rehabilitation programs. The two most promising technologies for home-based motor rehabilitation programs are inertial wearable sensors and video-based motion capture systems. In this paper, after a thorough review of the relevant literature, an original experimental analysis is reported for two corresponding commercially available solutions, a wearable inertial measurement unit and the Kinect, respectively. For the former, a number of different algorithms for rigid body pose estimation from sensor data were also tested. Both systems were compared with the measurements obtained with state-of-the-art marker-based stereophotogrammetric motion analysis, taken as the gold standard, and were also evaluated outside the lab in a home environment. The results in the laboratory setting showed similarly good performance for the elementary large motion exercises, with both systems having errors in the 3–8 degree range. Usability and other possible limitations were also assessed during utilization at home, which revealed additional advantages and drawbacks for the two systems. The two evaluated systems use different technology and algorithms, but have similar performance in terms of human motion tracking. Therefore, both can be adopted for monitoring home-based rehabilitation programs, provided that adequate precautions are taken regarding operation, user instructions and interpretation of the results.

prevention, and the search to provide personalized and patient-centric solutions. Both trends are enabled by unobtrusive sensing technologies, allowing for continuous monitoring and increased engagement with the patient outside the clinic [2]. Movement analysis and its use for motor rehabilitation is one of the many application fields where innovative technical solutions for unconstrained and autonomous monitoring of the patients are being adopted [3].
Standard practices for motor rehabilitation include the clinician's supervision and evaluation of the patient's movements when these are performed during therapy sessions in the clinic, and no supervision or feedback when the exercises are executed at home. Computer vision and stereophotogrammetry-based technologies have been widely proven as accurate and reliable tools for objective measurement of human motion [4,5]. However, the costs and difficulties of operation of such systems have limited their use to research rather than everyday clinical and rehabilitation practice. The development of miniaturized inertial sensors paved the way for the development of wearable Inertial Measurement Units (IMUs) and their use for motion capture [6,7]. Such technologies have also been validated in lab environments for medical applications and motor rehabilitation analyses [8,9]; however, the available solutions involve cost- and complexity-related limitations.
Nowadays, both research and commercial applications are experiencing a push in ubiquitous computing and the use of wearable and interconnected sensing devices for a wide range of applications, from entertainment to fitness and wellbeing [10]. The adoption of fitness and activity trackers is driven by their low cost and ease of use, but these usually have limited accuracy in the reported data [11]. For a successful adoption of these new technologies in rehabilitation, there is a need to evaluate their accuracy and reliability and to provide insights on their proper use in order to define best practices and standardized protocols [12]. The recent innovative low-cost sensing solutions and relevant algorithms for data analysis, once validated, can be effectively introduced in rehabilitation protocols both in specialized centers and at home, and truly enable a patient-centric, preventive and smart healthcare revolution [13].
In the field of human motion analysis, both video and inertial-based solutions now have low-cost options, suitable for wide adoption and everyday use; examples include the Kinect camera [14] and various activity tracking and wearable inertial sensors [15]. Their integration into bio-feedback-based systems and combination with exergames and appropriate back-end infrastructure allows for the development of innovative solutions for real-time monitoring of home-based rehabilitation therapies and for continuous remote supervision by the clinician [16]. The first platforms providing such functionalities include DoctorKinetic (DoctorKinetic, Netherlands), SilverFit (SilverFit, Netherlands) and Riablo (Corehab, Italy). This paper reports an overview of these major systems, analyzing in the literature the state-of-the-art of the Kinect and of wearable motion sensing in rehabilitation, but mainly focuses on a validation work for the quantitative assessment of these systems. Since there is a lack of direct comparison and discussion on the differences between the two technologies, an original experimental study was performed and is reported here to evaluate and directly compare the Kinect v2 and a commercially available wearable IMU (EXLs3 by Exel srl, Italy). Their technological characteristics and state-of-the-art

Review of sensing technologies for motion analysis in rehabilitation
This section will analyze current systems in human motion analysis, starting from the well-established video motion capture used in gait laboratories, and then focusing on innovative and low-cost alternatives, suitable for autonomous use at home. The reported references are summarized and compared in Table 1.

Video-based motion capture
The use of cameras and computer vision algorithms for the analysis of human motion is a well-established application field, and has notable contributions from both research and industry [17]. Video-based motion capture and Marker-Based Stereophotogrammetry systems (MBS) are now the de-facto standard for high-precision applications, including biomechanics research and clinical gait analysis [4].
In MBS systems, multiple cameras employ Infra-Red (IR) illuminators and triangulation algorithms to track the 3D position of reflective markers moving within a calibrated field of view. When used for human motion capture, the subject is instrumented with a set of reflective markers to identify and track relevant anatomical landmarks, and the system uses their positions to reconstruct and track the subject's body segments and joints [18]. These systems have been proven to offer accurate and reliable motion tracking and are widely used in human motion research and clinical studies. The established accuracy is less than 1 mm error for the position of single markers, which translates into errors in the range of 1–4 degrees for the estimation of joint angles, according to the specific marker cluster configuration [18,19]. There are a number of commercially available systems, equipped with high-performance cameras and different software solutions for out-of-the-box motion analysis, including Vicon Nexus (Vicon Motion Systems, UK), Elite (BTSengineering, Milan) and Optitrack Motive (NaturalPoint, USA).
The main downside of MBS is the high cost and the complexity of its setup and use. To address these issues, research solutions explore the use of marker-less motion capture systems and their integration with depth sensors [20,21]. Despite promising results, the accuracy and reliability of these new techniques do not yet meet the needs of healthcare applications, and they still involve cumbersome hardware and extensive data processing requirements [22,23].

Low-cost video sensing: the Kinect
Microsoft first introduced the Kinect sensor in November 2010 to be used as a motion capture input device, as an add-on for the Xbox game console. It featured a standard digital video camera, a depth sensor based on structured IR illumination, and a directional microphone. The integration of the Kinect with dedicated algorithms allowed markerless tracking of the pose and movements of the user's body segments, creating a natural user interface based on gestures [24]. Although it was developed and sold as a game controller, its offer of RGB video and IR-based depth sensing (RGB+D), at a very low price, made it appealing for a wide range of users, also in biomechanical and clinical research [25,26]. With the availability of drivers and of a Software Development Kit (SDK) for a more general use beyond gaming, the Kinect has been applied to a vast range of academic and industrial projects, including the fields of interaction, robotics and, in fact, biomechanics [27]. The first version of the Kinect (Kinect v1) was followed by a re-designed sensor presented in 2013 (Kinect v2), which introduced an improved RGB camera and a new IR time-of-flight depth sensor [28]. Kinect v2 and its new SDK improved the sensor's tracking capabilities and enhanced its use in applications based on human motion tracking [29]. The Kinect sensors have been extensively evaluated in relation to several application fields. The accuracy of the sensors and their depth estimation capabilities have been analyzed carefully [30], as well as the differences between the two versions [31][32][33]. Focusing on human motion capture applications, the use of the Kinect v1 in such scenarios was triggered by the release of reverse-engineered open-source drivers and tracking software [34] and then propelled by the release of Microsoft's SDK [35].
The second-generation device and its updated algorithm have been validated further within the context of clinical motion analysis, with applications such as posture and balance evaluation [36,37], fall detection [38], rehabilitation exercises [39][40][41], and gait assessment [42][43][44]. Moreover, the usability of Kinect-based home rehabilitation systems has been investigated, providing insights on the user acceptance with good results and indications for future improvements [45,46].
The two generations of Kinect sensors have been compared in validation studies: when applied to posture or movement evaluations, these showed similar results, with the Kinect v2 just slightly outperforming its predecessor [47,48]. The new sensor achieved good overall performance in the tracking of human pose and elementary movements, but showed obvious limits when dealing with more complex exercises or when the movements were not performed with the subject standing facing the sensor. These results necessarily limit the use of the Kinect as an accurate tool in the clinical context, but open the door to its use in somewhat qualitative evaluations of posture and exercise, and also show the potential of such a system for at-home monitoring of rehabilitation therapies.

Inertial-based motion capture
The availability of Microelectromechanical Systems (MEMS) and their development for miniaturized sensors, combined with integrated processing and communication technologies, enabled the development of wearable sensing devices for human body monitoring [1,15]. To obtain information regarding specific human locomotion parameters, one or more sensing devices are worn directly on relevant body parts and connected to a central processing hub for data collection and processing, forming a so-called Body Sensor Network [69]. However, this multi-sensor setup presents a number of technological requirements in terms of sensing capabilities, signal bandwidth, throughput and other general challenges such as device wearability, system usability, and data reliability [70].
There are several commercial examples, starting with high-end solutions for body motion capture [71], which are mainly used for animation and clinical movement analysis, all the way to ubiquitous motion trackers and sensors embedded in smartphones [72]. Notable examples include MVN Biomech (Xsens Technologies, Netherlands) and Opal (APDM Technologies, USA). The research and academic community is also very active on this topic, with several proposed platforms [73][74][75][76][77][78].
A wearable IMU provides an unobtrusive method to collect motion data relative to the body segment where it is worn; by combining a network of sensors to form a whole-body model, joint motion can also be deduced. The integration of multiple sensors within the same device (accelerometer, gyroscope and magnetometer) allows the deployment of robust sensor fusion algorithms in order to provide reliable and detailed information in a wide range of dynamic conditions and application contexts. In biomechanics, the most common application is the estimation of the device's orientation from the embedded sensors and its use for the estimation of joint angles [79,80]. Algorithms derived from navigation applications are adapted to infer the orientation of the body segment of interest and include the Kalman Filter (KF), its extended and unscented variations, and also several implementations of Complementary Filters (CF) [81][82][83][84]. Moreover, IMU sensor data can be exploited to analyze various features of human motion and dedicated algorithms have been developed for tasks such as activity recognition [85], exercise recognition and evaluation [86,87], gait analysis [63,88,89] and jump analysis [90,91].
Research and clinical studies have validated the use of wearable IMUs also in various conditions and applications [80,92]. Notable examples include balance and postural evaluation [93,94], fall monitoring and prediction [95], gait analysis [96] and rehabilitation [64]. Laboratory evaluations and comparisons with high-precision MBS systems have shown high accuracy and reliability of wearable motion sensors. Hence, these can be used in clinical practice for the evaluation of human motion and can provide a valuable and portable tool for standardized motor tests [97]. Usability aspects of the employment of such systems for home rehabilitation evaluation have also been investigated providing encouraging results [98][99][100]. However, the scientific community still faces the challenges implied in the development of accurate, reliable and easy to use wearable solutions for motion analysis, and in their extensive validation in real-life contexts.
Another emerging approach is to combine the outputs of the two systems [66,67,101]. In [67] the authors propose a sensor fusion algorithm to combine the Kinect and a set of wearable IMUs, showing how the combined result achieves higher accuracy than either of the two systems. Additionally, in [66] an integration of IMUs and Kinect for the tracking of upper limb motions is proposed, showing again improved results when compared to the two separate systems. The work in [68] compares the use of IMUs and a Kinect-based system (Reha@Home) for gait analysis. This comparison, however, was limited to only one subject performing a short walk in front of the camera. In addition, the typical major problem of the Kinect, i.e., the occlusion of body segments during the motion exercise, was favored by the experimental setup adopted and the exercise performed.
Separately, the different systems have been extensively evaluated, but direct comparisons and discussions of their tradeoffs are still very limited. Moreover, all the reported studies focus on laboratory-based validation, although the greatest potential of such systems lies in at-home use, and therefore these systems should also be evaluated in this context. This work provides a detailed analysis of the literature on the two approaches (see Table 1) and an original comparison performed both in the laboratory with a state-of-the-art motion tracking reference and in different home settings.

Review of segment and joint kinematics estimation algorithms
The estimation and tracking of human segment and joint kinematics using video or wearable sensing is a well-documented research field, with several available solutions. This section will outline the main methods used in this work with the Kinect and with inertial sensors, whose estimates will be directly compared.

Kinect
Microsoft provides a comprehensive SDK for the Kinect, which includes a ready-to-use algorithm for the estimation and tracking of the user's complete body pose. The latest update provides real-time tracking for up to six people, with estimated 3D positions for a complete skeletal model formed of 21 body joints and quaternion-based rotations of the relevant segments. The algorithm is based on the identification of the different body segments from the RGB+D video stream and uses a Random Forest recognition approach, which was trained with a wide dataset composed of real and synthetic data [24]. The research community has proposed some alternatives and there is still on-going work on pose estimation from RGB+D streams [102]. However, Microsoft's solution is the de-facto standard thanks to its robustness and ease of use. For these reasons, it was used in several validation and exploitation studies [36,39,42,43] and is also used in the present work. The provided data are only low-pass filtered to eliminate noise. Further offline smoothing or any other processing was avoided because, in the present study, real-time tracking of the exercise was targeted.
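As an illustration of how the 3D joint positions of such a skeletal model can be turned into a joint angle, the following minimal sketch computes the angle at a joint from three positions. It is not the SDK's algorithm nor the processing used in this study; the function names (`joint_angle_deg`, `knee_flexion_deg`) and the convention that a straight leg corresponds to 0 degrees of flexion are assumptions made only for this example.

```python
import math

def joint_angle_deg(proximal, joint, distal):
    """Angle at `joint` (degrees) between the vectors joint->proximal and joint->distal."""
    u = [p - j for p, j in zip(proximal, joint)]
    v = [d - j for d, j in zip(distal, joint)]
    norm_u = math.sqrt(sum(c * c for c in u))
    norm_v = math.sqrt(sum(c * c for c in v))
    if norm_u == 0.0 or norm_v == 0.0:
        raise ValueError("two joints coincide: a segment has zero length")
    cos_angle = sum(a * b for a, b in zip(u, v)) / (norm_u * norm_v)
    # Clamp to guard against rounding slightly outside [-1, 1].
    cos_angle = max(-1.0, min(1.0, cos_angle))
    return math.degrees(math.acos(cos_angle))

def knee_flexion_deg(hip, knee, ankle):
    """Assumed convention: 0 degrees for a straight leg, increasing as the knee bends."""
    return 180.0 - joint_angle_deg(hip, knee, ankle)
```

For example, `knee_flexion_deg((0, 1, 0), (0, 0.5, 0), (0.5, 0.5, 0))` describes a shank at a right angle to the thigh. A planar angle of this kind ignores axial rotation, which is one reason why segment rotations are also relevant.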

Inertial sensing
Most of the previous validation studies used commercial solutions to obtain the orientation of the wearable sensors, which are used to estimate the orientation of the body segment they are attached to. Their outputs are then combined to form a partial or complete body pose estimation, based on the number of sensors in use [9,59–64]. While there are several proposals for algorithms for the estimation of orientation from inertial sensors' data, the present work analyzes the most used ones, to provide a comparative analysis targeting robust and well-established solutions. Moreover, to evaluate the standard at-home use of these systems, robust but ready-to-use approaches, without the need for system calibration or additional operations, were considered. In particular, although all the proposed methods provide the full orientation of the device, its horizontal component is often not considered, since it is heavily affected by environmental ferro-magnetic disturbances. Although inertial and magnetic sensors may be influenced by the environment (e.g., temperature [103,104]), environment-aware calibration and rejection techniques are out of the scope of the present work. The EXLs3 sensors used in the present study are factory calibrated, and all the experiments had a limited duration with standard and stationary environmental conditions; therefore, effects from the environment are assumed to be negligible. All the orientation estimation algorithms are based on a combination of triaxial sensor inputs composed of accelerometer readings $a = \{a_x, a_y, a_z\}$, gyroscope readings $\omega = \{\omega_x, \omega_y, \omega_z\}$ and magnetometer readings $m = \{m_x, m_y, m_z\}$, providing as output the orientation of the sensor, expressed as a quaternion ($q = \{q_0, q_1, q_2, q_3\}$), as Euler angles, or as a rotation matrix.
Each of these three orientation representation methods has its advantages, but quaternions are usually preferred for their computational efficiency, and the results are then converted to Euler angles because these are easier to interpret [105].
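The conversion from a quaternion to Euler angles mentioned above can be sketched as follows. The rotation sequence is not specified in the text, so the common Z-Y-X (yaw, pitch, roll) sequence and the scalar-first component order $q = \{q_0, q_1, q_2, q_3\}$ are assumed here; the function name is hypothetical.

```python
import math

def quaternion_to_euler_deg(q):
    """Convert a unit quaternion (q0, q1, q2, q3), scalar first, to
    (roll, pitch, yaw) in degrees, assuming the Z-Y-X rotation sequence."""
    q0, q1, q2, q3 = q
    roll = math.atan2(2.0 * (q0 * q1 + q2 * q3), 1.0 - 2.0 * (q1 * q1 + q2 * q2))
    # Clamp the sine of the pitch so rounding errors cannot break asin.
    sin_pitch = max(-1.0, min(1.0, 2.0 * (q0 * q2 - q3 * q1)))
    pitch = math.asin(sin_pitch)
    yaw = math.atan2(2.0 * (q0 * q3 + q1 * q2), 1.0 - 2.0 * (q2 * q2 + q3 * q3))
    return math.degrees(roll), math.degrees(pitch), math.degrees(yaw)
```

With this sequence the representation becomes singular when the pitch approaches 90 degrees, which is one of the reasons why the computations themselves are usually carried out with quaternions.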

Orientation estimation from accelerometer (ACC)
Using accelerometer (or accelerometer and magnetometer) outputs, the sensor's orientation is estimated by applying trigonometric functions. This approach assumes that the accelerometer is measuring only the gravity acceleration, and hence it is reliable only in static conditions. Accelerometer readings are used to estimate a partial orientation of the device, $E_a$. From the magnetometer measurements, the missing horizontal heading is then estimated using $m_a$, the magnetometer reading projected onto the accelerometer-estimated orientation plane identified by $E_a$.
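The ACC approach can be illustrated with the textbook tilt and tilt-compensated heading formulas. This is a sketch under stated assumptions, not necessarily the exact expressions used in this study: the accelerometer is assumed to read +g on its z axis when the sensor lies flat, the Z-Y-X Euler sequence is assumed, and all function names are hypothetical.

```python
import math

def tilt_from_accelerometer(acc):
    """Roll and pitch (radians) from a static accelerometer reading (ax, ay, az)."""
    ax, ay, az = acc
    roll = math.atan2(ay, az)
    pitch = math.atan2(-ax, math.hypot(ay, az))
    return roll, pitch

def heading_from_magnetometer(mag, roll, pitch):
    """Heading (radians) after projecting the magnetometer reading onto the
    horizontal plane identified by the accelerometer-based roll and pitch."""
    mx, my, mz = mag
    mxh = mx * math.cos(pitch) + (my * math.sin(roll) + mz * math.cos(roll)) * math.sin(pitch)
    myh = my * math.cos(roll) - mz * math.sin(roll)
    return math.atan2(-myh, mxh)

def acc_orientation_deg(acc, mag):
    """Full (roll, pitch, yaw) estimate in degrees; valid only in static conditions."""
    roll, pitch = tilt_from_accelerometer(acc)
    yaw = heading_from_magnetometer(mag, roll, pitch)
    return math.degrees(roll), math.degrees(pitch), math.degrees(yaw)
```

Any linear acceleration of the segment adds to gravity in `acc` and directly corrupts the roll and pitch, which is why this estimate is considered reliable only in static or slow conditions.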

Gyroscope integration (GYR)
Orientation of the sensor can also be estimated by integration of the angular velocity provided by the gyroscope. This estimate is reliable in dynamic situations, but suffers from drift due to numerical integration errors. According to the chosen orientation representation, there are several implementations of its derivative; here, the quaternion one is adopted, resulting in $\dot{q} = \frac{1}{2}\,\Omega(\omega)\,q$, which is integrated as $q(t + \Delta t) = \left(I + \frac{1}{2}\,\Omega(\omega)\,\Delta t\right) q(t)$, where $\Omega(\omega)$ is the (4 × 4) skew-symmetric matrix built from the gyroscope readings, $\Delta t$ is the sampling interval, $I$ is a (4 × 4) identity matrix and $q(0)$ is computed using the ACC estimation during a short static initialization.
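A first-order quaternion integration of the gyroscope signal can be sketched as below. The sketch assumes angular rates expressed in the sensor frame in rad/s, a scalar-first quaternion, and a re-normalization after every step (a common practical addition that is not stated in the text); `integrate_gyro` is a hypothetical name.

```python
import math

def integrate_gyro(q, gyro, dt):
    """One integration step q <- (I + 0.5 * Omega(gyro) * dt) q, then normalization.

    q    : (q0, q1, q2, q3), scalar first
    gyro : (wx, wy, wz) in rad/s, sensor frame (assumed convention)
    dt   : sampling interval in seconds
    """
    q0, q1, q2, q3 = q
    wx, wy, wz = gyro
    h = 0.5 * dt
    # Rows of Omega(gyro) applied to q, written out explicitly.
    n0 = q0 + h * (-wx * q1 - wy * q2 - wz * q3)
    n1 = q1 + h * (wx * q0 + wz * q2 - wy * q3)
    n2 = q2 + h * (wy * q0 - wz * q1 + wx * q3)
    n3 = q3 + h * (wz * q0 + wy * q1 - wx * q2)
    norm = math.sqrt(n0 * n0 + n1 * n1 + n2 * n2 + n3 * n3)
    return (n0 / norm, n1 / norm, n2 / norm, n3 / norm)
```

In this sketch any constant offset in `gyro` is integrated together with the true rate, so the orientation error keeps growing over time; this is the drift that the fusion algorithms described next are designed to correct.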

Kalman filter (KF)
The Kalman filter is a widely used approach for optimal fusion of accelerometer and gyroscope orientation estimates [79,81]. Several variations have been proposed; here, a straightforward quaternion-based KF is applied, using the GYR derivative as the state equation, which is then corrected by the ACC measurement. In the KF state and measurement equations, $q(t)$ is the state estimate, $w \sim N(0, R)$ is the zero-mean Gaussian process noise with covariance matrix $R$, $q_a$ is the accelerometer-based orientation estimate and $v \sim N(0, Q)$ is the measurement noise with covariance matrix $Q$. The two covariance matrices were set to be diagonal with constant coefficients: 0.0001 for $Q$ and 0.1 for $R$.
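A simplified linear quaternion KF step along these lines is sketched below. It relies on several assumptions that are not confirmed by the text: the state transition is taken as $I + \frac{1}{2}\Omega(\omega)\Delta t$, the measurement model is the identity ($q_a = q + v$), and the initial state covariance is a multiple of the identity. Under these assumptions, and with equal diagonal coefficients, the covariance stays a multiple of the identity, so the matrix recursion collapses to a scalar one. The defaults follow the naming of the text (0.1 for the process noise $R$, 0.0001 for the measurement noise $Q$); `kf_step` and its parameter names are hypothetical.

```python
import math

def kf_step(q, p, gyro, q_acc, dt, process_var=0.1, meas_var=0.0001):
    """One predict/update cycle of a simplified quaternion Kalman filter.

    q     : current state estimate (q0, q1, q2, q3), scalar first
    p     : scalar variance (state covariance assumed equal to p * I)
    gyro  : (wx, wy, wz) in rad/s
    q_acc : accelerometer-based orientation estimate, as a quaternion
    """
    q0, q1, q2, q3 = q
    wx, wy, wz = gyro
    h = 0.5 * dt
    # Prediction with the gyroscope-driven state equation.
    q_pred = [q0 + h * (-wx * q1 - wy * q2 - wz * q3),
              q1 + h * (wx * q0 + wz * q2 - wy * q3),
              q2 + h * (wy * q0 - wz * q1 + wx * q3),
              q3 + h * (wz * q0 + wy * q1 - wx * q2)]
    # F * F^T = (1 + h^2 * |w|^2) * I because Omega is skew-symmetric.
    p_pred = p * (1.0 + h * h * (wx * wx + wy * wy + wz * wz)) + process_var
    # q and -q describe the same rotation: keep the measurement in the same hemisphere.
    if sum(a * b for a, b in zip(q_acc, q_pred)) < 0.0:
        q_acc = [-c for c in q_acc]
    # Correction with the accelerometer-based measurement.
    gain = p_pred / (p_pred + meas_var)
    q_new = [qp + gain * (qa - qp) for qp, qa in zip(q_pred, q_acc)]
    norm = math.sqrt(sum(c * c for c in q_new))
    return tuple(c / norm for c in q_new), (1.0 - gain) * p_pred
```

With the coefficients quoted above the gain is very close to 1, so in this sketch the corrected state follows the accelerometer-based estimate almost entirely at each step.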

Madgwick filter (MAD)
Another approach for a quaternion-based iterative fusion of ACC and GYR estimates has been proposed by Madgwick [84] and it has been well received because of its high-quality estimate and limited computational and memory requirements. It is based on a gradient descent algorithm, which iteratively finds the optimal orientation given the input signals, and it is governed by differential equations in which the filter calculates the orientation $q$ by numerically integrating the estimated orientation rate $\dot{q}_{est}$. This rate is computed as the rate of change of orientation measured by the gyroscopes, $\dot{q}_\omega$, with the magnitude of the gyroscope measurement error, $\beta$, removed in the direction of the estimated error, $\dot{q}_a$, computed from accelerometer and magnetometer measurements. $\dot{q}_a$ is computed with the gradient descent method and $f$ represents the function that provides the orientation from accelerometer and magnetometer readings. The implementation of the algorithm makes use of established matrix and quaternion operations, and the correction parameter $\beta$ was empirically set to 0.01 [84].
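The structure of one filter update can be sketched as follows for the accelerometer-plus-gyroscope case only; the magnetometer term is omitted for brevity, which is a simplification with respect to the description above. The objective function and its Jacobian follow the commonly published form of the algorithm, and `madgwick_imu_step` is a hypothetical name.

```python
import math

def madgwick_imu_step(q, gyro, acc, dt, beta=0.01):
    """One gradient-descent fusion step (gyroscope + accelerometer only).

    q : (q0, q1, q2, q3) scalar first; gyro in rad/s; acc in any unit (normalized here).
    """
    q0, q1, q2, q3 = q
    gx, gy, gz = gyro
    # Orientation rate measured by the gyroscope: 0.5 * q (x) (0, gyro).
    q_dot = [0.5 * (-q1 * gx - q2 * gy - q3 * gz),
             0.5 * (q0 * gx + q2 * gz - q3 * gy),
             0.5 * (q0 * gy - q1 * gz + q3 * gx),
             0.5 * (q0 * gz + q1 * gy - q2 * gx)]
    norm_a = math.sqrt(sum(c * c for c in acc))
    if norm_a > 0.0:
        ax, ay, az = (c / norm_a for c in acc)
        # Objective: predicted gravity direction in the sensor frame minus the measured one.
        f = [2.0 * (q1 * q3 - q0 * q2) - ax,
             2.0 * (q0 * q1 + q2 * q3) - ay,
             2.0 * (0.5 - q1 * q1 - q2 * q2) - az]
        # Gradient J^T f of the objective with respect to q.
        grad = [-2.0 * q2 * f[0] + 2.0 * q1 * f[1],
                2.0 * q3 * f[0] + 2.0 * q0 * f[1] - 4.0 * q1 * f[2],
                -2.0 * q0 * f[0] + 2.0 * q3 * f[1] - 4.0 * q2 * f[2],
                2.0 * q1 * f[0] + 2.0 * q2 * f[1]]
        norm_g = math.sqrt(sum(c * c for c in grad))
        if norm_g > 0.0:
            # Remove the estimated error direction, scaled by beta.
            q_dot = [d - beta * g / norm_g for d, g in zip(q_dot, grad)]
    q_new = [c + d * dt for c, d in zip(q, q_dot)]
    norm_q = math.sqrt(sum(c * c for c in q_new))
    return tuple(c / norm_q for c in q_new)
```

Because the gradient is normalized, `beta` sets the speed at which the accelerometer can pull the estimate, independently of the size of the error; a small value such as 0.01 therefore gives a slow correction that relies mostly on the gyroscope.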

Complementary filter (CF)
Another class of orientation estimation algorithms was developed by Mahony et al. using non-linear complementary filters [83]. Such an approach is also becoming popular for its accuracy and reduced computational complexity when compared to the KF. In this case, a rotation matrix representation is used and, contrary to the KF, the filter combines accelerometer and gyroscope estimates with a constant correction factor, following an update equation in which $R$ is the rotation matrix, $k_p$ is the filter gain, empirically set to 0.5, and $\gamma$ is the correction term given by the difference between the previous estimate and the current one from the accelerometer [83].
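A rotation-matrix complementary filter step of this kind can be sketched as below. The sketch assumes the widely used form $\dot{R} = R\,[\omega + k_p \gamma]_\times$, with $\gamma$ taken as the cross product between the measured and the estimated gravity directions in the sensor frame; this specific error term, the row-wise re-orthonormalization and the name `mahony_step` are assumptions for illustration, not a statement of the exact implementation used in this study.

```python
import math

def _cross(a, b):
    return (a[1] * b[2] - a[2] * b[1],
            a[2] * b[0] - a[0] * b[2],
            a[0] * b[1] - a[1] * b[0])

def _unit(v):
    n = math.sqrt(sum(c * c for c in v))
    return tuple(c / n for c in v)

def mahony_step(R, gyro, acc, dt, kp=0.5):
    """One complementary-filter step on a 3x3 rotation matrix R (sensor to world)."""
    gamma = (0.0, 0.0, 0.0)
    if any(c != 0.0 for c in acc):
        # Third row of R = world vertical expressed in the sensor frame.
        gamma = _cross(_unit(acc), R[2])
    wx, wy, wz = (g + kp * e for g, e in zip(gyro, gamma))
    # First-order update matrix I + [w]x * dt.
    M = ((1.0, -wz * dt, wy * dt),
         (wz * dt, 1.0, -wx * dt),
         (-wy * dt, wx * dt, 1.0))
    Rn = [[sum(R[i][k] * M[k][j] for k in range(3)) for j in range(3)] for i in range(3)]
    # Re-orthonormalize the rows (Gram-Schmidt) to keep a valid rotation.
    r0 = _unit(Rn[0])
    proj = sum(a * b for a, b in zip(r0, Rn[1]))
    r1 = _unit(tuple(b - proj * a for a, b in zip(r0, Rn[1])))
    r2 = _cross(r0, r1)
    return (r0, r1, r2)
```

Unlike the KF, the correction gain `kp` is fixed, so no covariance has to be propagated; this is the source of the reduced computational cost noted above.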

Results
The experimental part of this work directly compared the two systems in a laboratory setting, using a high-precision MBS tracking system together with an internationally established technique as the gold standard. In addition, the use of these two systems out of the laboratory was compared, performing a pilot evaluation in unconstrained environments such as a patient's home. A detailed description of the experimental methodology and protocols is provided in the "Methods" section. Table 2 reports the mean differences, i.e., root mean square errors (RMSE), and the standard deviation for the considered techniques for IMU orientation estimation and for the estimates provided by the Kinect, when compared to the corresponding results established through MBS. The different IMU approaches have a similar performance, with the KF and MAD algorithms outperforming the others. It is interesting to note that the single-sensor algorithms have only limited degradation compared to the sensor fusion ones and in some cases even outperform the CF. This is mainly due to a combination of the type of performed exercises (large motions with relatively low dynamics) and their short duration, which allow also ACC or GYR estimates to have limited RMSE. The IMU estimates employing sensor fusion algorithms outperform the Kinect's output, though by a limited margin, revealing that both approaches (by sensors or cameras) have a good overall performance, with errors in the range of 3 to 8 degrees for all the joint angles analyzed, which is consistent with existing literature [39,55,59–61,68]. As expected, the exercises performed while wearing clothes show slightly higher RMSE and deviations; however, they are consistent with the standard GA case. The comparison between the different approaches confirms again that the sensor fusion algorithms outperform the single-sensor estimates and are aligned with the Kinect ones.
Given the limited sample size, a thorough analysis of the performance degradation cannot be provided; however, the additional errors in this case can be attributed to the motion artifacts caused by clothing, which allows relative motion both between the markers and the underlying anatomical landmarks and between the IMUs and the markers. Figure 1 shows the resulting angles for the frontal lunge exercise in standard gait analysis while undressed (left), and the corresponding angles with clothing (right), from the same subject. Patterns from both IMU and Kinect generally match the gold standard from gait analysis well, with consistent results for all the executed exercises and across all users. The timing of the waveforms is exactly the same, whereas peak values show differences, though these are consistent over repetitions. These differences can be attributed to the different techniques applied for the calculation of orientation: the IMU tracks a single limited area of the body segment, while the Kinect searches for the overall orientation of the segment, which is also affected by its deformation during motion.
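The error metric reported in Table 2 can be illustrated with a minimal sketch. It assumes that the estimated and reference angle traces have already been time-aligned and resampled to the same rate, a step not shown here; the function names are hypothetical.

```python
import math
import statistics

def rmse_deg(estimate, reference):
    """Root mean square error between two equally sampled angle traces (degrees)."""
    if len(estimate) != len(reference) or not estimate:
        raise ValueError("traces must be non-empty and of equal length")
    squared = [(e - r) ** 2 for e, r in zip(estimate, reference)]
    return math.sqrt(sum(squared) / len(squared))

def summarize_trials(trial_rmse):
    """Mean and sample standard deviation of per-trial RMSE values (needs >= 2 trials)."""
    return statistics.mean(trial_rmse), statistics.stdev(trial_rmse)
```

A typical use would be to call `rmse_deg` once per exercise repetition against the MBS trace and then pass the resulting list to `summarize_trials`.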

Table 2 Mean errors for various IMU orientation estimation algorithms and for the Kinect v2 when compared to the MBS output

At-Home evaluation
Without the availability of a gold standard, it is not possible to calculate errors for the two systems for a quantitative evaluation. However, it is possible to qualitatively evaluate the outcomes and compare them to the data collected in the lab. In particular, it is possible to compute the average difference and standard deviation between the two estimates and use such parameters to compare the lab and home sessions. Table 3 collects the root mean square difference, its standard deviation and the maximum differences between IMU and Kinect estimates. For the three subjects who performed the exercises both in the lab and at home, there is a direct comparison of the two environments, while the average results for all sessions performed at home provide a qualitative insight into the performance outside the lab. A sample of the joint angles resulting from at-home acquisitions is plotted in Fig. 2, where the frontal lunge exercise is shown to facilitate the comparison with Fig. 1, which shows the same exercise from the same subject performed in the lab. All the performed exercises were correctly acquired by both systems in all four test environments. A comparison of the outputs shows that they reported the expected outcomes, and the two systems again show similar performance. For the subjects monitored both in the lab and at home, the difference between the two systems is consistent in both cases, with the exception of the knee flexion angle, which exhibits higher deviations in the uncontrolled environments. The same outcomes are also observed for the average over all the subjects who performed exercises in the home environments. Considering both Figs. 1 and 2, it emerges that the systems show a difference in the estimated range of motion, with the Kinect underestimating it when compared to both the MBS reference and the IMU output.
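The range-of-motion comparison discussed above can be made explicit with a short sketch. It assumes two time-aligned joint angle traces in degrees; the function names and the definition of underestimation as a percentage of the comparison trace's range are choices made only for this example.

```python
def range_of_motion(angles):
    """Range of motion of a joint angle trace: maximum minus minimum (degrees)."""
    return max(angles) - min(angles)

def rom_underestimation_pct(test, comparison):
    """How much smaller the range of `test` is, as a percentage of the range of `comparison`."""
    rom_cmp = range_of_motion(comparison)
    if rom_cmp == 0.0:
        raise ValueError("comparison trace has no motion")
    return 100.0 * (rom_cmp - range_of_motion(test)) / rom_cmp

def max_abs_difference(a, b):
    """Largest sample-by-sample absolute difference between two aligned traces."""
    return max(abs(x - y) for x, y in zip(a, b))
```

A positive value from `rom_underestimation_pct(kinect_trace, imu_trace)` would indicate that the first trace spans a smaller range than the second; a negative one would indicate the opposite.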
Similar results were observed in all acquired sessions, though such behavior should be analyzed further to establish systematic evaluations of the outcomes from the two systems. Moreover, Kinect-based estimates show also considerable peak discontinuities, as depicted in Fig. 2, right column. This result was observed throughout the dataset and it can be caused by image glitches disturbing the vision-based tracking algorithm. Of course, this influences the reported measurements of estimate differences. To mitigate this effect, ad-hoc filtering or smoothing techniques may be applied in the future.
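One simple candidate for such spike suppression is a sliding median filter, sketched below. This is only an illustration of the kind of smoothing that could be explored, not a technique evaluated in this study; the window length and the function name are arbitrary assumptions.

```python
import statistics

def median_filter(signal, window=5):
    """Centered sliding median; the window shrinks near the ends of the signal."""
    if window < 1 or window % 2 == 0:
        raise ValueError("window must be a positive odd number")
    half = window // 2
    filtered = []
    for i in range(len(signal)):
        lo = max(0, i - half)
        hi = min(len(signal), i + half + 1)
        filtered.append(statistics.median(signal[lo:hi]))
    return filtered
```

A centered window needs `window // 2` future samples, so in a real-time setting it would add a corresponding delay; a short window keeps that delay small while still rejecting isolated single-frame glitches.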

Discussion
In the last decade, the consumer market opened the way for a broader acceptance and use of wearable sensing devices. Activity trackers are now widely employed in everyday life, but with limited reliability and validation of results [11,106]. More accurate wearable inertial sensors have been adopted for a wide range of clinical applications [2,107], with a huge potential to innovate and improve nearly every aspect of healthcare applications. But for a successful exploitation of these systems in healthcare and in particular in rehabilitation, there is definitely the need for their careful quantitative validation. In addition to these, unobtrusive sensing systems based on video and depth cameras are available at a low price and high performance, such as the Kinect v2 here assessed. It was originally developed as an interaction controller for home video games, but it has gained attention also for general research and clinical applications for its capability to track human subjects' movements in real time [27,108]. With respect to inertial sensors, video-based tracking is even less invasive, as the body of the tracked subject is free of any instrument. Several studies have analyzed the performance and validated these two systems for the tracking of human motion in clinical applications, including postural and balance control, rehabilitation exercises, gait, or specific conditions such as Parkinson's or post stroke rehabilitation. Laboratory tests showed the limits of the low-cost tracking technologies when compared to state-of-the-art MBS systems; however, these also highlighted their overall applicability to ubiquitous patient monitoring (see Table 1 and the references therein). 
The development and adoption of innovative monitoring systems for effective patient monitoring in unconstrained environments open new research challenges in these systems' reliability, sensitivity to environmental and operational factors, usability and acceptability by the clinicians and the end users, i.e., the patients.
In the present study, a thorough experimental analysis was performed to assess the accuracy of two instruments for human motion tracking in the context of rehabilitation. The experiments are established however as preliminary measurements on a limited sample size. Nevertheless, the state-of-the-art gait analysis was arranged as gold standard, and a large number of exercises were analyzed. These were a limited specific set within all possible rehabilitation exercises, particularly used to recover from a large number of orthopedic disorders and treatments. The scope in fact was to test the two instruments in a number of general yet well representative motor tasks; in the future these two instruments shall be tested also in other possible exercises. It is important to note, however, that among those analyzed here, the squat position is definitely very physically demanding for the extreme joint positions implied, and as such particularly suitable to reveal large measurement differences. As additional limitation, the two clothing conditions were tested in a single subject only, but this was thought just to reveal the additional artifact introduced by the clothes used routinely in these exercises, knowing that the gold standard for these measurements is represented by the motion at the skeletal system. Validation against state-of-the-art gait analysis was performed in two different conditions, though in a very small number of subjects. The standard procedure always requires the subject to be undressed, with all the markers attached to the skin in correspondence of relevant anatomical landmarks. This is recommended for a repeatable application of the marker set (for intra-and inter-subject comparisons) and to avoid the disturbances of the clothing, which adds considerable artifactual measurements. 
However, this is not the typical condition for the users of these systems; therefore, the validation was repeated, in one subject, in a more realistic dressing condition, with the user wearing comfortable fitness clothing, typical of physical exercises in the gym or at home. In the latter case, the measures were less accurate, but they are more representative of a real scenario. The preliminary results reported here for the two systems highlight the importance of instructing the users to perform the exercises with limited and appropriate clothing and to wear the sensors tightly, to limit occlusions and motion artifacts. Although all the present sensing technologies are likely to be affected by environmental factors (e.g., temperature, humidity, etc.) and by their status (duration of use, etc.), a detailed analysis of such influences is beyond the scope of this work. The present experimental protocol was instead designed to minimize the impact of any such external factors and environmental conditions. Moreover, the aim was to limit the differences between the acquired sessions and with respect to the corresponding conditions in the relevant literature (Table 1).
The two systems showed similar performance in terms of final angle estimations when considering simple large-motion exercises. The measurements from this experimental work, in both the laboratory and the at-home sessions, show good repeatability and consistency, and therefore provide a reliable evaluation of the performance of relevant rehabilitation exercises. The results also showed differences in the body segment orientations, and therefore in the joint rotations, but these are consistent and small with respect to the corresponding overall range of motion. These findings are consistent with the literature, which generally reports errors below 10 degrees [40,109].
Today, there is no consensus on the accuracy that motion tracking systems must provide to be appropriate for physical rehabilitation. However, based on the existing literature [23,110], reports from therapists and physicians, as well as practical experience, errors in human segment or joint rotations smaller than 3 degrees would be tolerable for most rehabilitation programs in orthopedics; errors between 3 and 6 degrees can still be acceptable, depending on the joint, the pathology and treatment, and the status of the patient. For example, after the replacement of shoulder, hip and knee joints, the range of motion usually restored is far larger than 100 degrees, so an error of this size would be only a small percentage of it (a 6-degree error over 100 degrees is 6%). In this context, the two analyzed technologies perform well, and the errors revealed here are acceptable for most major human diarthrodial joints, compatibly with the status of the patient and the rehabilitation exercises under observation. Careful supervision and evaluation, whether direct or indirect (i.e., for at-home sessions), should in any case be guaranteed by trained therapists. This is nonetheless a step forward with respect to qualitative observations, which are biased by therapist experience.
Nevertheless, the different underlying technology of these two systems introduces additional considerations on their effective use. The Kinect is a well-supported commercial platform and benefits from its very simple operational requirements. To track movements, it just needs to be placed 3–4 m in front of the subject and connected to a personal computer, without the need for additional instrumentation or further requirements. However, its vision-based approach restricts the tracked area, requires a frontal view, and tolerates no interposed objects; also, its low sampling frequency limits the range of movements correctly tracked. In particular, fast and complex movements, as well as those with large components out of the frontal plane of the sensor, are not tracked by the system [44], thus precluding its use in applications such as real-life monitoring of patients and rehabilitation exercises performed while lying or with support devices. In addition, its limited field of view precludes its use for unconstrained gait monitoring.
Wearable IMUs are now a mature and widely adopted technology, with several commercial solutions ranging from whole-body motion tracking suites to sensor kits and stand-alone units. IMUs attached to a target body segment, combined with suitable sensor fusion algorithms, are nowadays commonly employed to analyze human motion within a large spectrum of motor tasks and exercises, from upright posture to complex sports activities [109,111,112]. IMU use for clinical motion analysis has been extensively evaluated regarding accuracy and reliability, but evaluation studies are mostly confined to laboratories [64,93,96]. Considering at-home uses, wearable IMUs have an additional requirement when compared to the Kinect, since the user has to wear the sensors. This operation usually consists of fitting a simple elastic band, which can be considered simple enough for autonomous use at home even by children and the elderly, but it can be, in theory, a source of uncertainty (i.e., sensor misplacement) or it can be problematic for severely impaired users. On the other hand, wearing the sensors on the user's body allows for less-constrained tracking and for the development of a mobile solution capable of acquiring movements in a truly unconstrained and pervasive manner. The vast range of available sensors, paired with state-of-the-art processing algorithms, allows for the development of diversified solutions covering a wide spectrum of human motions, including static and postural analysis, rehabilitation exercises, jump analysis, gait analysis, fall detection, etc.

Conclusions
This work addresses two of the most promising technologies for at-home rehabilitation monitoring based on real-time motion analysis, i.e., wearable IMUs and Kinect. The Kinect incorporates video and depth sensors and provides easy to use, real-time, full-body tracking at a low price. Wearable inertial sensors are now emerging as another reliable tool for movement analysis, providing an additional instrument for patient monitoring also in clinical and research settings. In the first part of this study, a detailed critical analysis of the literature on these technologies was performed (see Table 1), and in the second part original comparisons between the two are reported, after thorough experiments performed both in a state-of-the-art motion capture laboratory and in direct home settings.
From the literature it emerged that the two different technologies have been assessed extensively, though mostly separately, with very limited direct experimental comparisons. In addition, only a few studies have addressed the final real conditions of use, i.e., at-home. Therefore, an original experimental analysis was performed, in both environments. The two systems showed similar performance in tracking elementary exercises with large range of motion, and provided comparable results both in the laboratory setting and in-home tests. In the former, IMUs combined with different sensor fusion algorithms showed an average RMSE of 5.5° (±2.3°) over the performed exercises, which matches well with those from the Kinect, 5.6° (±2.0°). These exercises were replicated with the same experimental protocol and with the same users in home environments, showing results much in support of those obtained in the laboratory. The Kinect has the advantage of very simple operational requirements, but it lacks the capabilities to track complex and highly dynamic movements, especially when the user does not move in front of the sensor. On the other hand, IMUs must be worn, but work well in a large variety of human movements, also at high speed. Both technologies, however, can be adopted for home-based rehabilitation monitoring, after taking adequate precautions about user instructions and about correct interpretation of the results. With further developments and large-scale real-life evaluations, these technologies will allow careful and pervasive patient monitoring and relevant clinical studies in the near future.

Methods
This section describes the methodology and the comparison protocols employed for the experimental analysis. Our institution's Review Board (Comitato Etico dell'Istituto Ortopedico Rizzoli) approved the study conducted in the present work. All participants received detailed information about the study and provided written consent for the use of acquired data. All acquired data were anonymous and only age, gender, weight and height were stored along with the exercise data here reported. The subjects were recruited among graduate students at our institution.

Laboratory evaluation
The direct instrumental comparison of the two systems was performed at the Movement Analysis Laboratory of the Rizzoli Orthopaedic Institute (Bologna, Italy) as shown in Fig. 3. Subjects' motion was concurrently monitored by a Kinect v2 (Microsoft, Seattle, USA), a set of three EXLs3 wearable IMUs (Exel srl, Bologna, Italy) and a high-precision 8-camera MBS motion tracking system (Vicon 612, Vicon Motion Systems Ltd, Oxford, UK) sampling at 100 Hz.
During the acquisitions, the Kinect was placed in front of the subject, at a distance of approximately 3.50 m, and at 1 m from the ground (Fig. 3). It was verified that the subject was at the center of the field of view of the sensor, as recommended by the product guidelines. The Kinect4Windows 2.0 SDK was used for data acquisition and processing. It provides a reconstruction of the full-body segments, formed by the positions and angles of 21 joints [24]. These data were saved for offline analysis by means of a custom application. The SDK does not allow control over data acquisition and provides an approximate sampling rate of 30 Hz.
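As an illustration of how a joint angle can be derived from the joint positions of a reconstructed skeleton, the following minimal Python sketch computes knee flexion from three 3D points. It is not the processing used in this study, which is not detailed at this level. The function names and the plain `(x, y, z)` tuples are hypothetical, and no SDK call is shown.

```python
import math

def included_angle_deg(a, b, c):
    """Angle at point b (degrees) formed by the 3D points a-b-c."""
    u = [a[i] - b[i] for i in range(3)]
    v = [c[i] - b[i] for i in range(3)]
    dot = sum(x * y for x, y in zip(u, v))
    norm_u = math.sqrt(sum(x * x for x in u))
    norm_v = math.sqrt(sum(x * x for x in v))
    # Clamp to [-1, 1] to guard against rounding just outside acos's domain.
    cos_angle = max(-1.0, min(1.0, dot / (norm_u * norm_v)))
    return math.degrees(math.acos(cos_angle))

def knee_flexion_deg(hip, knee, ankle):
    """Flexion is 0 for a straight leg and grows as the knee bends
    (assumed convention: 180 degrees minus the included angle)."""
    return 180.0 - included_angle_deg(hip, knee, ankle)

# Hypothetical joint positions in metres, camera frame.
straight = knee_flexion_deg((0, 1.0, 0), (0, 0.5, 0), (0, 0.0, 0))
bent = knee_flexion_deg((0, 1.0, 0), (0, 0.5, 0), (0, 0.5, 0.5))
```

This geometric angle does not separate flexion from rotations in other planes, so it is only comparable to anatomical joint angles for movements that are mostly planar.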
For IMU tracking, a 3-sensor kit of EXLs3 wireless IMUs was used. This study focused on the evaluation of lower limb movements, hence the three sensors were placed on the frontal aspects of the subject's thorax and of the left thigh and shank. The devices are self-worn using elastic bands with a dedicated pocket for the IMU. Each EXLs3 device is factory calibrated and provides an on-board estimation of its orientation, in addition to triaxial sensor data for accelerometer (±2 g full scale), gyroscope (±500 dps full scale) and magnetometer (±1200 µT full scale). The devices are equipped with a Bluetooth transceiver for data streaming to a host device. In the performed tests, sensor data were sampled at 100 Hz and streamed to a personal computer for offline analysis. Given the placement of the IMUs, and by combining the orientations of the three sensors, it is possible to estimate the thorax sagittal and frontal orientation, the hip joint sagittal and frontal angles and the knee joint flexion/extension. As a gold standard reference, a state-of-the-art MBS motion capture system and an established gait analysis protocol were used. Before starting the data collection, 33 spherical 15-mm reflective markers were placed on the lower limbs, pelvis and thorax over known anatomical landmarks according to a validated protocol [113]. From these markers, anatomical-based reference frames were defined for each segment, and three-dimensional joint rotation angles were calculated according to international recommendations and conventions [114]. Thorax sagittal and frontal plane inclinations, hip joint sagittal and frontal angles and knee sagittal angle, i.e., flexion/extension, from these measurements and calculations were used as the gold standard for the comparison of the corresponding Kinect and IMU-based estimates. These gait analysis results were stored for offline comparative analysis.
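The combination of sensor orientations into a joint angle, as described above for the IMUs, can be sketched as follows. This is a minimal Python illustration and not the algorithm used in this study. It assumes unit quaternions in `(w, x, y, z)` order that map each sensor frame to a common global frame, and it assumes that the sensor x-axis is aligned with the flexion/extension axis. All function names are hypothetical.

```python
import math

def quat_conj(q):
    """Conjugate (inverse, for a unit quaternion)."""
    w, x, y, z = q
    return (w, -x, -y, -z)

def quat_mul(a, b):
    """Hamilton product a * b."""
    aw, ax, ay, az = a
    bw, bx, by, bz = b
    return (
        aw * bw - ax * bx - ay * by - az * bz,
        aw * bx + ax * bw + ay * bz - az * by,
        aw * by - ax * bz + ay * bw + az * bx,
        aw * bz + ax * by - ay * bx + az * bw,
    )

def relative_rotation(q_proximal, q_distal):
    """Orientation of the distal sensor expressed in the proximal sensor frame."""
    return quat_mul(quat_conj(q_proximal), q_distal)

def twist_angle_deg(q, axis_index):
    """Rotation angle of q about one coordinate axis (0=x, 1=y, 2=z),
    taken as the twist part of a swing-twist decomposition."""
    w, v = q[0], q[1 + axis_index]
    if w < 0:  # q and -q are the same rotation; keep the result in [-180, 180]
        w, v = -w, -v
    return math.degrees(2.0 * math.atan2(v, w))

def quat_about_x(angle_deg):
    """Helper for the example: rotation of angle_deg about the x-axis."""
    half = math.radians(angle_deg) / 2.0
    return (math.cos(half), math.sin(half), 0.0, 0.0)

# Example: thigh tilted 20 degrees, shank tilted 80 degrees about the same axis.
thigh = quat_about_x(20.0)
shank = quat_about_x(80.0)
knee_flexion = twist_angle_deg(relative_rotation(thigh, shank), 0)
```

In practice the sensor axes are not exactly aligned with the anatomical axes, which is one reason a static reference posture is used to align the segment orientations.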
The study involved three healthy subjects (female, 1.75 m, 26 years; female, 1.65 m, 31 years; male, 1.83 m, 34 years) who performed physical exercises typical of rehabilitation programs after replacement of lower limb joints. For all three, standard gait analysis was performed, which implies instrumenting the subjects without clothing (Fig. 3 left). This is considered the optimal experimental setting, with the best possible accuracy of the measurements because of the direct attachment of the markers to the skin without interposition. For one of the three subjects, gait analysis was repeated days later while wearing comfortable fitness clothing (Fig. 3 right). It is worth noting, however, that when data are collected with the subject clothed, MBS measurements are likely to be affected by noise, since the markers are attached to the clothing and some tissue motion artifacts are inevitable. The subjects wore close-fitting fitness clothing, which can limit these motion artifacts. The three subjects were first instructed about the functioning of the acquisition systems and how to wear the inertial sensors. In addition to squat (SQ), the following five exercises were performed with the left leg only: frontal lunge (FL), lateral lunge (LL), hip abduction (HA), hip flexion (HF), and hip extension (HE). These motion exercises include both basic and more complex movements and are typical of many rehabilitation programs targeting lower limb functional recovery [115,116]. For each exercise, the subjects were instructed to perform five repetitions with standard correct execution first, i.e., with the trunk upright, and then five more repetitions with the trunk markedly inclined forward, to mimic a common mistake in performing these rehabilitation exercises [115,116].
The overall quality of the exercises was assessed by analyzing thorax orientation and hip and knee joint rotations; among these measurements, target parameters, i.e., those to determine the biofeedback, and control parameters, i.e., those to be checked for a correct performance of the exercise, are specified in Table 4.
Spatial and temporal alignment of the reference frames from the three systems was performed offline. A short static upright double-leg posture of the subject was acquired at the beginning of each data collection session and used to align the body segment orientations provided by the three systems. Moreover, a sharp right leg movement was performed at the beginning of each session to facilitate offline time alignment of the data streams. All data were stored for offline analysis, which was performed in MATLAB. For a direct comparison, the joint rotation streams from the three systems were all re-sampled at 30 Hz.
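The alignment and comparison steps described above can be sketched as follows. The original analysis was performed in MATLAB and its details are not reported, so this Python sketch is only one possible realization: linear interpolation for re-sampling, a mean offset over the static posture, an integer-sample lag search on the sharp movement, and RMSE against the reference are all assumptions, as are the function names.

```python
import math

def resample_linear(times, values, rate_hz):
    """Linearly interpolate a (times, values) stream onto a uniform grid."""
    t0, t1 = times[0], times[-1]
    n = int(math.floor((t1 - t0) * rate_hz + 1e-9)) + 1
    out, j = [], 0
    for k in range(n):
        t = t0 + k / rate_hz
        # Advance to the input interval [times[j], times[j + 1]] containing t.
        while j < len(times) - 2 and times[j + 1] < t:
            j += 1
        frac = (t - times[j]) / (times[j + 1] - times[j])
        out.append(values[j] + frac * (values[j + 1] - values[j]))
    return out

def remove_static_offset(angles, n_static):
    """Subtract the mean angle over the first n_static samples (static posture)."""
    offset = sum(angles[:n_static]) / n_static
    return [a - offset for a in angles]

def best_lag(ref, sig, max_lag):
    """Integer shift such that sig[i + lag] best matches ref[i]
    (maximum unnormalised cross-correlation over the overlap)."""
    def score(lag):
        return sum(ref[i] * sig[i + lag]
                   for i in range(len(ref)) if 0 <= i + lag < len(sig))
    return max(range(-max_lag, max_lag + 1), key=score)

def rmse(estimate, reference):
    """Root-mean-square error over the common length of two streams."""
    n = min(len(estimate), len(reference))
    return math.sqrt(sum((estimate[i] - reference[i]) ** 2 for i in range(n)) / n)

# Example: a 100 Hz ramp re-sampled at 30 Hz.
times_100 = [i / 100.0 for i in range(101)]
ramp_100 = [10.0 * t for t in times_100]
ramp_30 = resample_linear(times_100, ramp_100, 30.0)
```

The unnormalised cross-correlation is adequate only when the streams contain a distinct synchronization event, such as the sharp leg movement; a normalised version would be more robust for general signals.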

At-Home evaluation
One of the main advantages of these two innovative approaches for human motion tracking is their low cost, which together with their small dimensions offers the possibility of ubiquitous adoption in rehabilitation centers, gyms and even at home. In addition to the lab comparison, therefore, a pilot study was conducted to evaluate their use in the latter uncontrolled environment. To test the variability associated with different environmental conditions in real-life scenarios, the two systems were used to collect data in five additional locations. In particular, two homes and three different office spaces were used, where a total of 10 subjects were asked to perform the same set of exercises as during the laboratory evaluation. The same three subjects who performed the exercises in the laboratory were also among the home test group, to allow for a direct comparison of their performance. The spaces differed in dimensions and lighting conditions, ranging from a small office with artificial light to a large living room under direct sunlight. All sessions followed the same protocol as for the lab evaluation, except for the MBS-based gait analysis and the reference tracking, which were not available outside of the lab. At each location, the Kinect was placed in the most suitable position for the environment, and each user was asked to autonomously set up the IMU sensors while wearing comfortable fitness clothing and then to perform the exercises according to precise instructions by the operator.