Kinematics-Based Predictions of External Loads during Handcycling

The increased risk of cardiovascular disease in people with spinal cord injuries motivates work to identify exercise options that improve health outcomes without causing risk of musculoskeletal injury. Handcycling is an exercise mode that may be beneficial for wheelchair users, but further work is needed to establish appropriate guidelines and requires assessment of the external loads. The goal of this research was to predict the six-degree-of-freedom external loads during handcycling from data similar to those which can be measured from inertial measurement units (segment accelerations and velocities) using machine learning. Five neural network models and two ensemble models were compared against a statistical model. A temporal convolutional network (TCN) yielded the best predictions. Predictions of forces and moments in-plane with the crank were the most accurate (r = 0.95–0.97). The TCN model could predict external loads during activities of different intensities, making it viable for different exercise protocols. The ability to predict the loads associated with forward propulsion using wearable-type data enables the development of informed exercise guidelines.


Introduction
Full-time manual wheelchair users, such as persons with spinal cord injuries, are at increased risk of chronic diseases such as cardiovascular disease (CVD), diabetes, and metabolic syndrome [1][2][3].Although the evidence supporting the benefits of physical activity is clear, many individuals with spinal cord injuries are physically inactive and sedentary leading to worsening symptoms and increased levels of disability [4,5].Unfortunately, the transition to manual wheelchair use presents challenges as the arms become the source of all activities of daily living.For example, as many as 68% of manual wheelchair users have rotator cuff tendon tears [6].An optimal exercise for improving cardiopulmonary function requires sufficient taxation of cardio-respiratory fitness levels without further increasing the risk of musculoskeletal injury.Current exercise recommendations for full-time manual wheelchair users have not resulted in significant changes in chronic disease, and this has been attributed in part to a lack of specificity in the prescription for manual wheelchair users who have less access to muscle mass during exercise.While activities that involve both pushing or pulling using the shoulder have been suggested [7], the specific exercise modes by which one should engage in such activities have not been definitively identified.
Handcycling has emerged as a promising exercise mode for wheelchair users [8] because the loads on the shoulder are lower than everyday wheelchair propulsion [9].Exercise guidelines (intensity, frequency, etc.) for handcycling remain to be established and require assessment of both cardiovascular stimulation and musculoskeletal safety.Musculoskeletal assessments, such as shoulder torque, provide objective measures of shoulder function and require measuring the external hand loads.However, unlike the variety of commercially available means for measuring ground reaction forces during legged walking, there are no off-the-shelf options for direct measurement of hand loads during handcycling.Custombuilt instrumented crank handles have been developed [9,10] but are not readily accessible outside of academic research settings.Enabled by the recent advances in wearable sensors that can measure accelerations and other kinematic variables, the field has turned towards the use of kinematic data to infer external loads via direct or indirect methods.
In the case of legged locomotion, the ground reaction force can be solved for directly using Newton's 2nd law if the kinematics of major body segments are measured [11,12].Both inertial measurement units and optical motion capture have been used to estimate ground reaction forces during walking, stair-use, and running [11][12][13][14][15].However, tracking all segments can be difficult and may not be feasible for every activity, thus motivating the use of statistical or machine learning models to predict external loads from subsets of kinematic data [13,16].
Segment accelerations have been used in neural network machine learning models to predict the ground reaction forces during legged walking [13].Glenohumeral joint reaction forces during wheelchair propulsion have been predicted using a combination of inertial measurement units and electromyography data in a bidirectional long short-term memory network architecture [16].To our knowledge, no study has reported kinematicbased predictions of external loads during handcycling.Therefore, the objective of this study was to evaluate the capacity of machine learning models to predict hand loads using segment accelerations.The capacity to measure external loads using kinematic data available from wearable sensors would enable field-based measurements and negate the need for specialized equipment.
To mimic data that could be measured via an inertial measurement unit, we based our analyses on accelerations of the radius segment.We evaluated two classes of machine learning models (neural network and ensemble) and compared their prediction performance against a statistical regression model.

Dataset
Biomechanics data of full-time wheelchair users (n = 20) completing a handcycling exercise protocol as part of a separate study [10] were used.For each participant, six propulsion cycles were extracted during moderate-intensity (45% peak power) handcycling and six cycles during high-intensity (90% peak power).A custom instrumented handle was used to measure the three forces and three moments applied at the right handcycle handle (sampling frequency = 2 kHz).Kinematic data of the right arm were recorded using a 10-camera motion capture system (Vicon, sampling frequency = 100 Hz).A rigid-body model [17] was scaled to each participant and used to determine the radius segment center of mass position and orientation using inverse kinematics (Figure 1).Data from each propulsion cycle were interpolated into 360 data points corresponding to the 360 degrees of crank rotation.
The segment positions were numerically differentiated twice using the central difference formula to obtain linear accelerations (⃗ a(t)) (Equation (1)): where ⃗ p is the radius center of mass position, t is the current time, and τ is the interval between data points.Orientations (Euler angles) were differentiated once to obtain angular velocities (⃗ ω(t)) (Equation ( 2)): where ⃗ e is the vector of Euler angles denoting the radius center of mass in the global coordinate system.

Model Objectives and Inputs
The objective of all models was to predict the continuous 6-degree-of-freedom applied loads at the handcycle handle given kinematic and anthropometric data.The training data for all models were based on 90% of participants (n = 18, randomly selected) while data from the remaining two participants were used as the test set.Data for model validation were based on a further split of the training dataset using n = 2 participants' data.Participant weight (lbs), wingspan (in), and exercise intensity were included with the kinematic data as model input features.Exercise intensity was defined as 0.5 or 0.9 to denote the varying protocol intensities.Prior to model training, each input feature was standardized by subtracting the mean and dividing by the standard deviation.This process was used to standardize the training, test, and validation datasets.

Model Overview and Descriptions
A classical statistical regression model and several machine learning models were compared, including five neural networks and two ensemble models (described below).Our choice of machine learning models was motivated by the work of others in predicting kinetic data [13,14,16] in addition to testing models of varying complexity in similar applications.A dense network was chosen because it is one of the simplest neural networks.Temporal convolutional convolutional networks (TCNs) have been used to predict times series data [18], and we chose to evaluate their performance along with a simpler convolutional neural network (CNN).Regression tasks have been performed with ensemble models such as a random forest (RF) algorithm [19] and the more complex RF-based gradient boosting machine (GBM).
Each of the neural network models was compiled using mean squared error as the loss function and the Adam algorithm as the optimizer (learning rate = 0.0001).Models were trained for a maximum of 1000 epochs with an early stop callback, which stopped training early if the loss function did not decrease by 0.01 in 50 epochs.A batch size of 16 propulsion cycles was used during training.Neural network and ensemble model hyperparameters were tuned using a heuristic approach beginning with default values prescribed by the machine learning libraries.Some model parameters, such as the number of layers in some models, were chosen to imitate models from the literature that were used for similar applications.Regularization was implemented using a heuristic approach to minimize model overfitting.All neural network models were created using Tensorflow [20], the random forest model was created with scikit-learn [21], and the gradient boosting model was created with XGBoost [22].All models were implemented using Python (v 3.9).
Linear regression: elastic net linear regression model (scikit-learn [21]), herein referred to as "Linear".The Linear model was trained with the least absolute shrinkage and selection operator (L1) and Tikhonov (ridge) (L2) regularization of coefficients.The Linear model parameters were α = 0.003, ρ = 0.5, and maximum training iterations = 2000.The model was fit by minimizing the objective function (Equation ( 3)): where ⃗ β is the vector of coefficients, X is the design matrix of predictors, ⃗ y is the vector of outputs, α is a constant that multiplies the penalty terms, and ρ is the ratio of L1 regularization to L2 regularization.Dense: dense neural network (Figure 2A) model with one hidden dense layer containing 200 nodes and L1/L2 regularization of 0.00001 applied to the kernel weights matrix.Batch normalization and dropout (45%) were applied after the hidden layer.The output layer was a dense layer with six nodes denoting the six output features.Model parameters for each of the neural network models are shown in Table 1.
Bidirectional long short-term memory (BiLSTM): two-layer BiLSTM model (Figure 2B).Each BiLSTM layer had 200 nodes with L1/L2 regularization (0.001) on the kernel weights matrix, recurrent kernel weights matrix, and bias vector.After each BiLSTM layer, batch normalization and dropout (45%) were applied.The output layer was a dense layer with six nodes.Convolutional neural network (CNN): two 1D convolutional layers (Figure 3A) were used to mimic the structure of the BiLSTM model.Each convolutional layer had 200 filters, a kernel size of 64, causal padding, rectified linear unit (ReLU) activation function, and L1/L2 kernel regularization (0.00001).After each convolutional layer, batch normalization and dropout (50%) were applied.The output layer was a time-distributed dense layer with six nodes.1.
Temporal convolutional network (TCN): six hidden layers with 2 n−1 dilation rate, where n is the layer number (Figure 3B).Each layer had 64 filters with a kernel size of 32 and L1/L2 regularization (0.01).After each layer was a ReLU activation layer followed by a dropout layer (40%).The output layer was a time-distributed dense layer with 6 nodes and L1/L2 kernel regularization (0.01).
Residual neural network (ResNet): forty layers of residual blocks (Figure 3C).Each residual block was made of a 1D convolutional layer starting with 32 filters and a kernel size of 32.The kernel size in the convolutional layer doubled every five layers.The convolutional layers used 'same' padding and had L2 regularization (0.001).After each convolutional layer, batch normalization and a ReLU activation layer were used.The output layer was a time-distributed dense layer with 6 nodes and L2 kernel regularization (0.001).
Random forest (RF) and gradient boosting model (GBM): two ensemble models were evaluated (Table 1).A random forest regressor [21] model was created with 400 decision trees, with each tree having a maximum depth of 10 levels.The second ensemble model was a gradient boosting machine (GBM) implemented using XGBoost [22].This model sequentially built 2000 decision trees, each with a maximum depth of 10, using a learning rate of 0.01.It employed both row and column subsampling at 70% and included L1 and L2 regularization (both set to 0.2).The GBM was trained for 1000 boosting rounds.

Statistical Analysis
For each of the 24 propulsion cycles in the test set, the median absolute percentage error (MdAPE) was calculated for each kinetic degree of freedom predicted.This analysis was repeated for the eight models tested.The Linear model was used as the baseline for comparison as it was the simplest model tested.
The distribution of model errors was non-normal for some degrees of freedom based on Shapiro-Wilkes tests for normality.Therefore, for each degree of freedom, a Kruskal-Wallis test was used to determine whether the errors differed between the models tested.If the differences were significant, Dunn's test for multiple comparisons was used to determine which models differed in performance compared with the Linear model.
The correspondence of the model prediction to the ground truth was also compared qualitatively by evaluating the shape of the predictions and quantitatively using Pearson's correlation coefficients.Kruskal-Wallis tests were used to determine whether the distributions of correlation coefficients for each degree of freedom differed between models, and a post hoc Dunn's test was used to determine differences compared to the Linear model.
Wilcoxon rank sum tests were used to compare predictions made by the best-performing model between the moderate and high-intensity propulsion cycles for each of the output features.Statistical analyses were performed using the SciPy package [23] in Python.

Results
Overall, errors in predictions of kinetics in the x-and y-directions, which contribute to the tangential and radial loads during handcycling, were lower (range: 32 to 56%) than kinetics in the out-of-plane z-direction (range: 45 to 101%).The Dense and RF models had significantly worse prediction performance (errors of 101 and 100%, respectively) than the Linear model for F z (66% error), and the RF model predictions were also significantly worse for T z .Most of the neural network models performed significantly better (error range: 38-42%) than the Linear model (error = 54%) for predicting F x (Table 2).The ResNet and TCN models performed better than the Linear model for F y (errors of 38 and 36%, respectively) and T x (errors of 32%).The GBM model predicted better than the Linear model for T y only (error = 39%).There were no other significant differences in prediction error compared to the Linear model.Of the neural network models, the TCN model had the lowest overall error, and the GBM model had the lowest error of the ensemble models.The range of error values for the best statistical, neural network (TCN), and ensemble (GBM)) models was smaller in the xand y-directions compared with the z-direction (Figure 4).The shape of the average prediction from the TCN model matched well with the ground truth for the in-plane kinetic degrees of freedom (Figures 5 and A1), with correlation coefficients ranging from 0.95 to 0.97 (Table 3).However, the model tended to underestimate the minima and maxima of each feature.For the x and y degrees of freedom, the errors tended to be highest at the beginning and end of the propulsion cycle.Force and torque profiles in the z-direction were poorly predicted.Finally, we evaluated whether model performance was sensitive to exercise intensity level.Errors in forces in the y-direction were moderately higher during high-intensity exercise (mean error = 41%) compared with moderate-intensity (mean error = 32%) (Figure 6).There were no significant differences in prediction performance between the two exercise intensities for the remaining degrees of freedom.

Discussion
We have shown that machine learning models can improve predictions of external loads during handcycling compared with a simpler elastic net linear regression model.Neural network models tended to perform better than ensemble models.Neural network predictions can leverage a longer time range of values, whereas tree-based ensemble models make instantaneous predictions at each timestep.Among the neural network models, convolutional neural networks outperformed recurrent (BiLSTM model) and dense neural network models.
Of all models tested, the predictive performance of the TCN was the most accurate for the kinetic parameters F x and F y .During handcycling, most force is applied in the tangential and radial directions with respect to the crank center and is a combination of the F x and F y forces.In contrast, forces applied in the lateral direction (F z ) are lower in magnitude.Thus, while the lateral forces and torques had higher percent errors, the error magnitude was relatively small.The poor ability to predict lateral forces and torques may be due to the variability in propulsion technique (shaded blue regions in Figure 5) [10], despite the fact that handcycling is a closed-chain task and relatively constrained in terms of motion.Individual techniques have been classified for manual wheelchair propulsion but have not been used in handcycling.The inclusion of propulsion style, either as a categorical variable or via the inclusion of additional segment kinematics as inputs, in future machine learning models may prove useful for improving kinetic predictions.Another option for improving model performance is to tune the hyperparameters of the neural network via grid searching though requires significant computational resources.Finally, although there were some differences in prediction performance between moderate-intensity and high-intensity propulsion cycles, a single model is likely sufficient for multiple exercise intensities provided the intensity data are used as an input to the model.
The prediction performance for handcycling loads was similar to other studies that predicted external loads from kinematic data during locomotion.Johnson et al. used a combination of optical motion capture kinematics and inertial measurement unit data to predict the ground reaction forces and moments during various athletic maneuvers and reported Pearson's correlation coefficients from r = 0.55 to 0.90 [13], similar to our results (r-range: 0.51 and 0.97) using the TCN model (Table 3).Liu et al. used joint angles from rigid body models to predict normalized ground reaction forces during stair use and reported Pearson's correlation coefficients between r = 0.908 and r = 0.991 [14].Normalizing the external loads reduces individual variability, which may increase model predictions, but is challenging to apply to data from persons with spinal cord injury.
There are a few limitations of this study.The size of the dataset used was relatively small, with 18 independent datasets used for training models.Remarkably, despite this small dataset, our best predictions were similar to others that used data from 80 subjects [14] for legged locomotion.One potential advantage of handcycling is the closed chain nature of the task which also may reduce the variability of the potential hand loads being predicted thus requiring less training data.The establishment of the minimum number of training datasets required for machine learning applications is the subject of ongoing research and may be task-dependent.Although the kinematic data used in this study were created to mimic that of a smartwatch-based inertial measurement unit, the data were derived from optical motion capture.Model hyperparameters were selected using a heuristic approach in this study to avoid the computational costs associated with alternative optimization methods such as the use of a grid search.While the use of a more advanced hyperparameter selection could be explored in future studies, we suggest that the inclusion of additional data or evaluation of alternative machine learning models should be prioritized to improve model accuracy.

Conclusions
The results of this study indicate that a TCN model could be used in place of an instrumented crank handle in future studies of handcycling propulsion biomechanics.However, it should be noted that the model is best suited for predicting loads that contribute to forward propulsion with more work needed to improve predictions in the lateral direction.Nonetheless, the ability to predict propulsive loads during handcycling now enables future research aimed at providing evidence-based exercise recommendations for people with spinal cord injuries by quantifying the external loads of different protocols.In addition to enabling musculoskeletal assessments of joint torques, the measurement of external loads is also needed to measure power during exercise.Power is among the most common metrics for assessing exercise effort.The machine learning model developed here now provides a means of assessing handcycling power using radius segment inertial measurements, such as those provided by a smartwatch.Measurements of power allow for research in the optimization of personalized exercise protocols (e.g., intensity, duration) based on an individual's peak power output.Establishing relevant power levels is critical for the success of exercise interventions aimed at improving the health of people with spinal cord injuries.Importantly, the use of wearable-type data combined with machine learning predictions of hand loads allows for the assessment of handcycling outside of the lab environment.
Future research should validate the use of these models with inertial measurement unit data and investigate the feasibility of using such models to predict handrim forces during manual wheelchair propulsion.While inertial measurement units enable field-based data collection that can be recorded continuously during physical activity, more work is needed to predict long-term continuous kinetics without the need for parsing individual propulsion cycles as carried out in this study.With these efforts, field-based measurements of kinetics will enable the needed advances to improve the health of wheelchair users via exercise.

Informed Consent Statement:
The data used in thus study was obtained with informed consent from all subjects involved in the study.

Figure 1 .
Figure 1.Motion-capture markers (pink spheres) used with subject-scaled rigid-body musculoskeletal model of upper arm.Radius segment (red circle) center of mass data derived from the radius position ⃗ p and orientation ⃗ e with respect to the global reference frame.

Figure 5 .
Figure 5. Average ground truth and average prediction of a propulsion cycle from the temporal convolutional network (TCN) model across each output feature.Predictions are colored orange and ground truth values are colored blue.Standard deviation ribbons are shown.
Author Contributions: Conceptualization, G.C.S., I.R. and M.E.K.; methodology, G.C.S. and M.E.K.; software, G.C.S. and M.L.; validation, G.C.S. and M.L.; formal analysis, G.C.S.; investigation, G.C.S., M.L.; resources, G.C.S. and K.M.H.; data curation, G.C.S. and M.L.; writing-original draft preparation, G.C.S. and M.L.; writing-review and editing, G.C.S., M.L., K.M.H., I.R. and M.E.K.; visualization, G.C.S. and M.E.K.; supervision, I.R. and M.E.K.; project administration, M.E.K.; funding acquisition, G.C.S. and M.E.K.All authors have read and agreed to the published version of the manuscript.Funding: This research was funded by the National Science Foundation Graduate Research Fellowship Program and the De Luca Foundation Seed for Science Initiative.Institutional Review Board Statement: The data used in this study was originally collected under approval of the Institutional Review Board of the University of Illinois (protocol 21552, 18 May 2022).