Novel Methods for Personalized Gait Assistance: Three-Dimensional Trajectory Prediction Based on Regression and LSTM Models

Enhancing human–robot interaction has been a primary focus in robotic gait assistance, with a thorough understanding of human motion being crucial for personalizing gait assistance. Traditional gait trajectory references from Clinical Gait Analysis (CGA) face limitations due to their inability to account for individual variability. Recent advancements in gait pattern generators, integrating regression models and Artificial Neural Network (ANN) techniques, have aimed at providing more personalized and dynamically adaptable solutions. This article introduces a novel approach that expands regression and ANN applications beyond mere angular estimations to include three-dimensional spatial predictions. Unlike previous methods, our approach provides comprehensive spatial trajectories for hip, knee and ankle tailored to individual kinematics, significantly enhancing end-effector rehabilitation robotic devices. Our models achieve state-of-the-art accuracy: overall RMSE of 13.40 mm and a correlation coefficient of 0.92 for the regression model, and RMSE of 12.57 mm and a correlation of 0.99 for the Long Short-Term Memory (LSTM) model. These advancements underscore the potential of these models to offer more personalized gait trajectory assistance, improving human–robot interactions.


Introduction
Since the development of robotic gait devices for support and rehabilitation began, enhancing human-robot interaction to offer more tailored solutions has been a primary area of research focus [1].This evolution is driven by the need to enhance embodiment and improve potential outcomes for assistance/rehabilitation tasks.To this end, a better understanding of human motion is required to successfully personalize the gait assistance provided by such devices.
Traditionally, references for gait trajectories in robotic assistance devices have been determined through pre-set patterns based on Clinical Gait Analysis (CGA) [2].While these methods have historically been widely applied as gait pattern reference trajectories across multiple robotic platforms [3-5], they are not without limitations.Specifically, such methods rely on predetermined trajectories that do not account for inter-individual gait variability due to morphological differences or gait condition variability, for example, walking speed or step length, parameters that provide a rigorous basis for estimating kinematic profiles [6].
Recently, however, the development of gait pattern generators has advanced significantly to address the variability in human gait and provide more personalized assistance solutions [7][8][9].Initially, these advancements involved the implementation of regressionbased models, which have been extensively applied to analyse measured gait kinematics across diverse conditions and multiple subjects.Yun et al. [10] explored angular gait pattern estimation using Gaussian Process Regressions (GPR), correlating 14 body parameters to joint motion in 113 subjects.Koopman et al. [11] utilized regression models to reconstruct the gait of 15 participants by dimensionally reducing their gait cycles to key-events.Most recently, Hu et al. [12] examined the use of Least Absolute Shrinkage and Selection Operator (LASSO) regression models to estimate Fourier coefficients that reconstruct lower-limb angle trajectories.While these models are invaluable for establishing a baseline understanding of gait kinematics and their variability under consistent conditions, they inherently lack the flexibility needed to adapt to transient changes within gait cycles.
To complement regression models and enhance adaptability towards transient gait variability, recent studies have explored the integration of Artificial Neural Network (ANN) techniques specializing in time-series forecasting for gait pattern estimation [13][14][15][16][17]. Luu et al. [14] evaluated a dimensionality reduction method that, when combined with Generalized Regression Neural Networks (GRNN), was applied to 70 adults walking at slow and normal velocities to predict sagittal plane angular trajectories for the hip and knee.Wu et al. [18] used Gaussian regression-based models to map relationships between body parameters and gait features, reconstructing gait trajectories using encoder-decoder neural network approaches for joint angle estimations in the sagittal plane.Further research has investigated specialized Recurrent Neural Networks (RNNs), including Gated Recurrent Units, to predict angular gait trajectories from 137 subjects [19].Zaroug et al. [15] assessed the performance of another RNN type known as Long Short-Term Memory network, training the model with six participants who walked at a consistent speed (5 km/h) using a sliding window approach to forecast the linear accelerations and angular velocities of the thigh and shank, achieving a correlation of 0.98 with the testing set.Su et al. [17] explored Long Short-Term Memory (LSTM)-based models for predicting gait phases and trajectories, achieving similar accuracy in predictions of angular velocities 100-200 ms in advance for thigh, shank, and foot segments using IMU data from 12 subjects walking at five different speeds.Additionally, Haneul et al. [20] examined gait phase recognition using a bi-directional LSTM, utilizing IMU data from sensors on the shank and feet of three subjects, achieving an accuracy of 86.43%.This growing interest in the development of more accurate gait pattern generators recognizes the need for tools which can provide not only robust baseline estimates of gait patterns but also dynamically adapt to the changing conditions of the individual during walking.
In the broader context of robotic gait rehabilitation platforms, both regression and ANN-based models play crucial roles in generating reference trajectories.Regression models are invaluable for creating stable and reliable trajectories, especially in scenarios involving patients with severely affected gait patterns or when precise historical gait data is lacking.These models are particularly effective as they ensure rehabilitation can progress effectively, even in the absence of reliable patient-specific data.Conversely, ANN models excel at handling real-time variability, which makes them ideal for fine-tuning adaptations to match the specific kinematics of healthy or less affected individuals whose movements are consistently reliable.However, ANNs face the significant challenge of the 'black box' phenomenon, where the decision-making processes remain opaque.Additionally, LSTM models, a type of ANN, require substantial computational resources, making them less feasible for real-time applications on hardware with limited processing power.This limitation is particularly problematic for very affected subjects whose unreliable kinematics demand high accuracy and safety in rehabilitation.The opacity of ANNs thus poses a considerable hurdle in clinical applications that require high levels of transparency and accountability.However, one notable limitation persists across the aforementioned gait pattern generators, as they solely focus on estimating angular-space trajectories.While effective within their scope, these models fall short when applied to end-effector robotic systems, which require precise control over the three-dimensional spatial positioning of joints.Although there are some works related to three-dimensional pose estimation of human joints, these are primarily oriented toward applications that do not necessitate the clinical precision required for gait rehabilitation, such as pedestrian pose estimation for autonomous system safety, surveillance, and security during human-robot collaboration [21][22][23].Few works have successfully bridged the gap between angular gait generators for gait rehabilitation purposes and comprehensive three-dimensional spatial trajectory predictions [24,25].
In response to this limitation and literature gap, we here introduce a novel approach that expands the state-of-the-art application of regression and ANN models beyond angular estimations to include precise three-dimensional spatial predictions.Specifically, our work studies multivariable regression techniques and LSTM models to provide dynamic and three-dimensional gait pattern predictions for end-effector gait assistance devices.We first present a multivariable regression model designed to handle the three-dimensional variability of human motion across the lower joints (hip, knee and ankle).This model utilizes the kinematic data from 42 individuals walking at eight speeds.The kinematic data were reduced to key-points which could then be used to train robust multivariable regression models for the reconstruction of gait trajectories.Complementing the regression model, we next present the integration of an LSTM network designed to predict one-stepahead gait trajectories, corresponding with the future 100 time-normalized points given data from the previously measured 100 points.

Gait Dataset Description
This study utilized a publicly available gait dataset [26], which consists of 42 healthy individuals categorized into two cohorts: 24 young adults (mean age 27.6 years, height 171.1 cm, weight 68.4 kg) and 18 older adults (mean age 62.7 years, height 161.8 cm, weight 66.9 kg); detailed demographic and physical characteristics are presented in Table 1.Each participant was instrumented with 28 markers on the pelvic and lower-extremity segments, adhering to the skin-mounted marker protocol by Leardini et al. [27].Joint trajectories were captured using a 12-camera high-speed motion-capture system at 150 Hz (Raptor-4, Motion Analysis Corporation, Santa Rosa, CA, USA) while ground reaction forces were measured using a dual-belt instrumented treadmill at 300 Hz (FIT, Bertec, Columbus, OH, USA).For each participant session, a total of eight trials were recorded.During these trials, participants were asked to walk at eight gait speeds, derived from their self-selected comfortable walking speed using the Froude number [28] as follows: 40%, 55%, 70%, 85%, 100%, 115%, 130%, and 145%.Each walking trial was recorded for 90 s, wherein data from the final 30 s was analysed as stationary walking.Visual overviews of these conditions are provided in Figure 1.

Age
For each participant session, a total of eight trials were recorded.During these trials, participants were asked to walk at eight gait speeds, derived from their self-selected comfortable walking speed using the Froude number [28] as follows: 40%, 55%, 70%, 85%, 100%, 115%, 130%, and 145%.Each walking trial was recorded for 90 s, wherein data from the final 30 s was analysed as stationary walking.Visual overviews of these conditions are provided in Figure 1.

Gait Analysis
Marker positions and ground-reaction forces were processed using MATLAB© version 2020b to derive joint kinematics and segment the trajectories into time-normalized gait cycles.Initially, joint positions were calculated relative to the motion capture system's origin (O).To isolate the joint movements relative to the body and accommodate for treadmill motion, these positions were then adjusted to reflect their placement relative to the subject's pelvis (P), see Figure 1.This correction ensures the joint trajectory data accurately represents the actual movement of the body, independent of the treadmill's influence.
Gait cycle normalization involved defining the start and end of each cycle using heel contact, identified from vertical force plate reactions when exceeding a threshold, see Figure 2a.The trajectories of both legs were segmented at these points to mark 0% and 100% of the gait cycle.Steps whose time duration deviated significantly from the norm, falling outside the 25-75% interquartile range of cycle duration of the trial, were excluded as outliers to ensure data integrity.This process resulted in a dataset comprising 15,531 steps, each carefully vetted for consistency and accuracy in representing gait dynamics.For a detailed visualization of these segmented trajectories, refer to Figure 2b.

Gait Analysis
Marker positions and ground-reaction forces were processed using MATLAB© version 2020b to derive joint kinematics and segment the trajectories into time-normalized gait cycles.Initially, joint positions were calculated relative to the motion capture system's origin (O).To isolate the joint movements relative to the body and accommodate for treadmill motion, these positions were then adjusted to reflect their placement relative to the subject's pelvis (P), see Figure 1.This correction ensures the joint trajectory data accurately represents the actual movement of the body, independent of the treadmill's influence.
Gait cycle normalization involved defining the start and end of each cycle using heel contact, identified from vertical force plate reactions when exceeding a threshold, see Figure 2a.The trajectories of both legs were segmented at these points to mark 0% and 100% of the gait cycle.Steps whose time duration deviated significantly from the norm, falling outside the 25-75% interquartile range of cycle duration of the trial, were excluded as outliers to ensure data integrity.This process resulted in a dataset comprising 15,531 steps, each carefully vetted for consistency and accuracy in representing gait dynamics.For a detailed visualization of these segmented trajectories, refer to Figure 2b.

Regression-Based Gait Pattern Generator
Human gait represents a complex sequence of movements, intricately coordinated and varying significantly among individuals due to unique physical conditions and differences in walking speeds [8].In response to this diversity and the requirement for personalized gait analysis, our approach uses multivariable regression methods to reconstruct the three-dimensional gait trajectories.

Gait Key-Points
To manage the inherent complexity in gait data, we adopted a down-sampling strategy similar to those presented in [11,24].Initially, for each subject, all steps within a walking trial were averaged, resulting in nine averaged trajectories per subject's session, corresponding to the X, Y, and Z axes of the hip, knee, and ankle of both legs.Subsequently, various key-points (k) representing the positions and velocities of each joint along each of their axes were extracted from the averaged trial data.Each key-point is parametrized as

Regression-Based Gait Pattern Generator
Human gait represents a complex sequence of movements, intricately coordinated and varying significantly among individuals due to unique physical conditions and differences in walking speeds [8].In response to this diversity and the requirement for personalized gait analysis, our approach uses multivariable regression methods to reconstruct the threedimensional gait trajectories.

Gait Key-Points
To manage the inherent complexity in gait data, we adopted a down-sampling strategy similar to those presented in [11,24].Initially, for each subject, all steps within a walking trial were averaged, resulting in nine averaged trajectories per subject's session, corresponding to the X, Y, and Z axes of the hip, knee, and ankle of both legs.Subsequently, various key-points (k) representing the positions and velocities of each joint along each of their axes were extracted from the averaged trial data.Each key-point is parametrized as k(t, y), with t representing the percentage of the gait cycle at which the key-point occurred, and y representing the position in the corresponding axis.
The number of key-points selected for each joint axis varied according to the complexity of the trajectory, with a total of 66 key-points retained, as detailed in Tables A1-A3 and in Figure 3a.Selection was based on an analytical process assessing the contribution of each point to the overall gait pattern shape.A stepwise regression was conducted to evaluate the selected points and verify their influence on characterizing gait velocity and height.Statistical significance was evaluated at an alpha level of 0.05.curred, and y representing the position in the corresponding axis.
The number of key-points selected for each joint axis varied according to the complexity of the trajectory, with a total of 66 key-points retained, as detailed in Tables A1-A3 and in Figure 3a.Selection was based on an analytical process assessing the contribution of each point to the overall gait pattern shape.A stepwise regression was conducted to evaluate the selected points and verify their influence on characterizing gait velocity and height.Statistical significance was evaluated at an alpha level of 0.05.

Regression Analysis on Key-Points
To estimate the characteristic three-dimensional trajectories, robust regressions [29] with bi-square weighting functions were applied to predict the parameter values (y, t) of the previously selected key-points (k).The regression formula employed was as follows: where β i are regression coefficients with i in [1,3], l represents the subject height, and v represents gait velocity.The squared v term was included in this equation to capture the non-linearities of joint kinematics [9].
This approach aims to pinpoint characteristic parameters from within the key events that enable the prediction of joint trajectories, taking into account participant-specific conditions such as height and gait velocity, Figure 3b.

Trajectory Reconstruction
To obtain the final three-dimensional trajectories, spline interpolations are used to join the estimated key-points and reconstructed trajectories, ensuring a smooth and continuous linkage of key-points, thereby accurately recreating the entire gait cycle as illustrated in Figure 4.
the previously selected key-points (k).The regression formula employed was as follows: where β are regression coefficients with i in [1,3],  represents the subject height, and  represents gait velocity.The squared  term was included in this equation to capture the non-linearities of joint kinematics [9].
This approach aims to pinpoint characteristic parameters from within the key events that enable the prediction of joint trajectories, taking into account participant-specific conditions such as height and gait velocity, Figure 3b.

Trajectory Reconstruction
To obtain the final three-dimensional trajectories, spline interpolations are used to join the estimated key-points and reconstructed trajectories, ensuring a smooth and continuous linkage of key-points, thereby accurately recreating the entire gait cycle as illustrated in Figure 4.

LSTM-Based Gait Pattern Generator
Long Short-Term Memory (LSTM) networks are a specialized subtype of Recurrent Neural Networks that excel in time-series analysis due to their specific architecture, designed to address the challenges of their long-term dependencies [30].LSTMs incorporate a sophisticated cell structure with multiple gates that manage the flow of information, allowing the network to retain or discard data dynamically, see Equations ( 2)-( 7).In the

LSTM-Based Gait Pattern Generator
Long Short-Term Memory (LSTM) networks are a specialized subtype of Recurrent Neural Networks that excel in time-series analysis due to their specific architecture, designed to address the challenges of their long-term dependencies [30].LSTMs incorporate a sophisticated cell structure with multiple gates that manage the flow of information, allowing the network to retain or discard data dynamically, see Equations ( 2)- (7).In the context of gait analysis, LSTMs offer potential advantages due to their ability to capture temporal dependencies and variability in gait patterns.
where f t is the forget gate at time t, σ is the sigmoid function, W represents the weight matrices of the gates, h t−1 is the previous hidden state, x t is the current output, b x are the biases of the gaits, i t is the input gate, ∼ C t is the candidate for addition to the cell state, C t is the new cell state at time t, o t is the output gate's activation at time t and h t is the hidden state at time t.

Data Preparation and Processing
For LSTM analysis, the 3D kinematic trajectories obtained from the aforementioned gait dataset (previously described in Section 2.1) were utilized to train, validate and test the model.Data were normalized based on the maximum and minimum values of each joint axis, height, and gait speed.To facilitate the LSTM network's performance, the dataset was strategically partitioned by assigning 60% of the walking sessions to the training set, 10% to the validation set, and the remaining 30% to the testing set, ensuring no overlap in participants between these sets.

LSTM Model Architecture
The model, as depicted in Figure 5, was designed to facilitate one-step-ahead predictions of gait trajectories.Its input layer is specifically formulated to receive sequences of 100 points (corresponding to a full gait cycle), each encompassing 11 distinct features.These input features represent the last known gait cycle data, including the subject's height, gait velocity, and the position coordinates (x, y, z) of the hip, knee and ankle.Inside the model, two principal LSTM layers of 120 hidden units each and separated by two dropout layers (set at 0.2 to prevent overfitting) encode and decode the input data.The high-level features learned by the LSTMs are fed into a fully connected layer, translating them into sequences of 100 points of nine distinct output features corresponding to predictions of the 3D positions of the lower-extremity joints during the future gait cycle, see Figure 6.

Model Training
The LSTM model was trained on 60% of the walking sessions using the Adam optimizer [31].The training involved 20 epochs, utilizing a mini-batch size of 256.The initial learning rate was set at 0.001 and adjusted dynamically according to a piecewise schedule every 125 epochs, reducing by a factor of 0.2.To mitigate the risk of exploding gradients, gradient clipping was employed with a threshold set at 1. To monitor and enhance model performance, validation checks were conducted every 1000 batches using 10% of the walking sessions designated as validation data, see Figure 7. Early stopping was implemented to halt training if no improvement in validation loss was observed for five consecutive epochs, thereby preventing overfitting.

Evaluation of Gait Models Accuracy
To evaluate the performance of the regression and LSTM models in estimating 3D gait trajectories of the ankle, knee, and hip, root mean squared error (RMSE) and Pearson's correlation coefficient (ρ (y, ŷ) ) were used as evaluation metrics over the testing datasets (30% of the full dataset), defined as ( 8)-( 9): The evaluation of the RMSE and correlations was conducted at various levels.The global model performance can be captured by the overall RMSE and correlation coefficient, indicated as RMSE O and ρ O .At a more detailed level, the accuracy for each joint's estimation in the X, Y and Z axes are presented by RMSE J and ρ J .Variability across different walking trials was also examined, RMSE

Model Training
The LSTM model was trained on 60% of the walking sessions using the Adam optimizer [31].The training involved 20 epochs, utilizing a mini-batch size of 256.The initial learning rate was set at 0.001 and adjusted dynamically according to a piecewise schedule every 125 epochs, reducing by a factor of 0.2.To mitigate the risk of exploding gradients, gradient clipping was employed with a threshold set at 1. To monitor and enhance model

Evaluation of Gait Models Accuracy
To evaluate the performance of the regression and LSTM models in estimating 3D gait trajectories of the ankle, knee, and hip, root mean squared error (RMSE) and Pearson's correlation coefficient (ρ ( , ) ) were used as evaluation metrics over the testing datasets (30% of the full dataset), defined as (8-9): The evaluation of the RMSE and correlations was conducted at various levels.The global model performance can be captured by the overall RMSE and correlation coefficient, indicated as RMSE and ρ .At a more detailed level, the accuracy for each joint's estimation in the X, Y and Z axes are presented by RMSE and ρ .Variability across different walking trials was also examined, RMSE and ρ , to assess the model's robustness to changes in trial-specific conditions (height and gait velocity).Additional step-level performance was analysed through  = RMSE |  = 1, … ,  and  = ρ |  = 1, … ,  , where  represents the number of steps per trial.

Regression-Based Model
All selected key-points exhibited similar dependencies for variations in height and gait speed, as presented in Tables A1-A3.Of the 66 key-points, the key-point position parameters (y) showed statistically significant relationships (p < 0.05) with gait speed for 56 key-points, and with subject height for 54 key-points.For the time parameter (t), 38 keypoints showed a significant relationship with gait speed, while 44 key-points were significantly related to height.These results indicate a comparable dependence on both factors, gait speed and body height on joint trajectories, which has not previously been reported in other angular generator models referenced in the literature [11].The adjusted parameters from the robust regressions across all key-points for each joint axis are detailed in

Regression-Based Model
All selected key-points exhibited similar dependencies for variations in height and gait speed, as presented in Tables A1-A3.Of the 66 key-points, the key-point position parameters (y) showed statistically significant relationships (p < 0.05) with gait speed for 56 key-points, and with subject height for 54 key-points.For the time parameter (t), 38 key-points showed a significant relationship with gait speed, while 44 key-points were significantly related to height.These results indicate a comparable dependence on both factors, gait speed and body height on joint trajectories, which has not previously been reported in other angular generator models referenced in the literature [11].The adjusted parameters from the robust regressions across all key-points for each joint axis are detailed in Tables A1-A3.We report RMSE errors ranging from 0.01 m to 0.06 m for position (y) estimations and from 0.6% to 13.17% for time (t) estimations.
Model accuracy was evaluated by comparing the reconstructed trajectories against the testing data.The overall root mean squared error (RMSE O ) was recorded at 13.40 mm, with an overall correlation coefficient (ρ O ) of 0.92, as highlighted in Table 2.At the joint level, detailed in Figure 8 and Table 2, the highest RMSE J value of 33.93 mm was recorded at the ankle-x axis, while the lowest error of 10.65 mm was observed at the hip-y axis.Despite the higher RMSE observed at the ankle joint, the strongest correlations, up to ρ J = 0.99, were also noted at the ankle (x, z axis) and also at the knee-x axis.This suggests that while the model effectively captures the directional trends of ankle movement, the large range of motion at this joint may amplify absolute errors, resulting in higher RMSE values.The weakest correlations were recorded at the hip y-axis with a value of ρ J = 0.71 primarily due to the nature of the variability of this joint and axis.Further analysis of the model robustness under varying trial-specific conditions revealed consistently strong performance, with mean RMSE T values below 30 mm across all speeds and subject heights in the testing set.However, performance noticeably decreased near the boundary values of the dependent variables, as illustrated in Figures A1-A4.Tables A1-A3.We report RMSE errors ranging from 0.01 m to 0.06 m for position (y) estimations and from 0.6% to 13.17% for time (t) estimations.Model accuracy was evaluated by comparing the reconstructed trajectories against the testing data.The overall root mean squared error (RMSE ) was recorded at 13.40 mm, with an overall correlation coefficient (ρ ) of 0.92, as highlighted in Table 2.At the joint level, detailed in Figure 8 and Table 2, the highest RMSE value of 33.93 mm was recorded at the ankle-x axis, while the lowest error of 10.65 mm was observed at the hip-y axis.Despite the higher RMSE observed at the ankle joint, the strongest correlations, up to ρ = 0.99, were also noted at the ankle (x, z axis) and also at the knee-x axis.This suggests that while the model effectively captures the directional trends of ankle movement, the large range of motion at this joint may amplify absolute errors, resulting in higher RMSE values.The weakest correlations were recorded at the hip y-axis with a value of ρ = 0.71 primarily due to the nature of the variability of this joint and axis.Further analysis of the model robustness under varying trial-specific conditions revealed consistently strong performance, with mean RMSE values below 30 mm across all speeds and subject heights in the testing set.However, performance noticeably decreased near the boundary values of the dependent variables, as illustrated in Figures A1-A4.

LSTM-Based Model
Subsets of 100 points from the testing dataset served as inputs for the model performance evaluation.Comprehensive performance, characterized by the overall root mean squared error (RMSE O ) and Pearson correlation coefficient (ρ O ), yielded values of 12.57 mm and 0.99, respectively.Joint-specific evaluations were conducted for all axes, revealing RMSE J values below 27.39 mm, and correlation coefficients (ρ J ) above 0.95.When tested under varied walking conditions and subject characteristics, continued low RMSE errors with marginal error amplification at the boundaries of height and velocity parameters were observed, as can be observed in Figures A1-A4.Additionally, model performance at the step level was analysed through the computation of RMSE across all testing steps.As indicated in Table 3, the mean RMSE values (R) and mean correlations (C) for all steps at each joint axis were found to be similar to the RMSE J values, yet it is noted that the standard deviations at the step-level were far greater than those reported at the joint level.

Discussion
This study contributes to the evolving landscape of gait pattern generator models by integrating regression-based techniques and LSTM networks in order to predict threedimensional gait patterns with high precision and adaptability.
The presented regression model demonstrated strong predictive accuracy closely aligned with state-of-the-art angular gait generators when predicting joint trajectories of the lower limbs: Koopman et al. [9] reported an overall correlation coefficient for all joints of 0.91 (ours was 0.92); Hu et al. [12] reported R 2 values between 0.68 and 0.98 for the hip, knee and ankle (while those herein reported ranged from 0.5 to 0.98).An RMSE comparison with the aforementioned methods is not possible given a lack of consistency between reported outcomes and their normalization between studies.Regarding state-ofthe-art three-dimensional gait pattern estimators: in our previous work [24], we obtained an overall RMSE of 25.73 mm compared to 13.4 mm obtained in the presented regression model; Di et al. [25] reported RSME for the ankle x, y and z axis of 59.1, 39.1 and 33.0 mm, respectively (ours were 33.93, 12.34 and 15.52 mm).
The LSTM-based model defined in our study showcased superior adaptability to temporal variations, reflecting an overall correlation coefficient of 0.99 and RMSE O of 12.57 mm, with joint-specific RMSE J values ranging from 2.84 to 27.30 mm and correlation coefficients ρ J between 0.92-0.99.These results are similar to findings from previous studies; Luu et al. [14] reported correlation coefficients of 0.98 and 0.97 for predictions of angular trajectories of the hip and knee in the sagittal plane, respectively; Wu et al. [18] documented similar correlations with coefficients ranging from 0.97 to 0.99 again for the hip and knee movements in the sagittal plane.Despite a lower correlation coefficient of 0.92 reported at the ankle-y axis in our model, it is noted that this measure pertains to motion in the frontal plane, which was not evaluated in the aforementioned studies.Sagittal plane estimations (x and z axes for all joints) in our model ranged from 0.97 to 0.99, underscoring consistency with, and in some respects, enhancement over the results reported in prior research.
Evaluation of the model accuracies in capturing variability across different walking trials revealed that while both proposed methods demonstrated reliable performance for conditions within the mid-range spectrum of walking speeds and subject heights, the regression model exhibited unique limitations.Specifically, it exhibited reduced accuracy for trials at the dataset's extremes of subject heights (147-192 cm) and gait velocities (1.29-8.02km/h), as evidenced by lower correlations and higher errors, which are detailed in Figures A1-A4.In contrast, the LSTM model maintained consistent performance, with only slight precision decreases at these boundary conditions.It is noted that both models consistently underperformed in predicting trajectories in the y-axis in comparison to the x and z axes.This may be attributable to inherently greater variability in lateral movements during walking, which are often less constrained and more prone to personspecific patterns, hence presenting a more complex prediction task for both regression and LSTM methodologies.
When examining the predictive performance in estimating intra-trial variability, the LSTM model outperformed the regression model predictions as expected, as can be seen in Table 3.Although both models display a similar percentage of outliers, the LSTM model achieves lower mean RMSE (R), standard deviations, and more narrow interquartile ranges, indicating less distribution of prediction errors.The LSTM model's strengths can also be observed through the average correlations (C) it achieved, which also exceeds that of the regression model.
Although the differences in RMSE and correlation between models are noteworthy, it is essential to understand that each model has specific applications and offers solutions to a common problem from two different approaches.The regression model provides a robust solution in scenarios where data is limited or computational resources are constrained.Conversely, the LSTM model offers superior adaptability and accuracy in real-time dynamic environments.Together, both models enhance the overall effectiveness of gait rehabilitation systems by addressing different needs and constraints.
The value of the regression model becomes particularly evident in challenging scenarios for gait forecasting, such as with patients who have severely affected gait patterns and limited or unavailable gait data, or when computational resources are restricted.This model generates stable and reliable trajectories by relying solely on the user's morphological values and desired speed, ensuring that end-effector gait rehabilitation robotic devices can effectively and safely guide the generated gait trajectories.Conversely, in situations where users are healthy or not severely affected and have available kinematic context data, the LSTM model excels.It adapts to real-time changes in gait dynamics, making it ideal for forecasting new trajectories and personalizing reference gait trajectories to finely tune to the evolving needs of the users.However, this comes at a greater computational cost.
This study has effectively demonstrated that both the presented regression model and the LSTM network are capable of estimating three-dimensional gait patterns at or above the level of current literature benchmarks in angular gait pattern generation [11,12,14,18].These models provide precise estimations of gait variability across differing gait conditions, highlighting their potential utility for tailored gait rehabilitation and assistance.
In summary, while both the regression and LSTM models demonstrate robust performance comparable to current state-of-the-art methods in gait analysis, they are inherently limited by the range of heights (147-192 cm), gait velocities (1.29-8.02km/h), and ages included in the dataset.This limitation suggests that the models may not generalize well to gait patterns outside of these observed conditions.The regression model, chosen for its balance of performance, adaptability, and computational cost, provides an efficient solution under conditions of limited data or resources.In contrast, the LSTM model is favoured for its real-time adaptability, though it demands higher computational power.Addressing the disparity between these models, there is a clear opportunity to explore the potential of more advanced computational architectures.Such models could be designed to enhance adaptability and accuracy while maintaining a manageable computational footprint.This approach would not only strive for a balance between performance and computational efficiency but also ensure that these solutions fit within the hardware constraints currently prevalent in gait rehabilitation systems.
Future work should focus on expanding the dataset to encompass a broader spectrum of subject characteristics, such as a wider range of ages and more diverse gait speeds.Additionally, integrating bilateral limb data could refine the estimation of inter-limb variability and balance, enhancing the comprehensiveness of gait analysis systems.Moreover, addressing the disparity between the current models suggests a clear opportunity to explore the potential of more advanced computational architectures.These should be carefully designed to enhance adaptability and accuracy while balancing the trade-off between computational efficiency and performance, fitting within the existing hardware constraints of gait rehabilitation systems.Beyond these constraints, pursuing precision without the limitations of current hardware, future research could also consider employing more computationally demanding models, such as larger and more complex Transformers.These models, known for their superior performance in tasks requiring contextual understanding, could lead to significant advancements in the precision and adaptability of gait analysis models [32].

Conclusions
This study introduces a dual approach for generating gait patterns, utilizing both regression-based and LSTM-based models to estimate three-dimensional gait trajectories.We aimed to extend beyond traditional gait pattern generators for robotic gait assistant purposes, which only focus on estimating angular parameters, to provide a comprehensive approach for three-dimensional joint gait pattern estimation.
Both presented gait generators, regression-based and LSTM-based, achieved accuracies comparable to state-of-the-art models, offering precise estimations of gait variability across a range of gait speeds (1.29-8.02km/h) and subject heights (147-192 cm).The presented regression-based model is particularly effective for estimating consistent predictions for specific gait conditions.It is ideally suited for gait rehabilitation in patients with significant physical impairment, where gait patterns may be unstable, severely affected, or virtually absent.Alternatively, the LSTM model excels at managing real-time, dynamic gait changes and is likely to be more beneficial in scenarios requiring advanced gait rehabilitation or assistance.In these settings, user gait trajectories are likely to be stable, and the task of the robotic system is therefore to adapt to inherent dynamic variability to enhance human-robot interaction.
Both models demonstrate significant potential to enhance end-effector robotic devices by providing them with the capability to assist gait trajectories through a more personalized approach.This dual-model strategy not only offers a robust solution to the complexities of gait analysis but also enriches the potential for tailored therapeutic applications, significantly improving outcomes for a diverse range of patients.

Figure 1 .
Figure 1.Experimental setup diagram illustrating the location of markers on the subject's body and force plates positions.

Figure 1 .
Figure 1.Experimental setup diagram illustrating the location of markers on the subject's body and force plates positions.

Figure 2 .
Figure 2. Comprehensive Gait Analysis Methodology and Kinematic Data Visualization.(a) Data Segmentation graph depicting how gait cycles are delineated using heel strike events detected by ground reaction forces on the force plate, with step transitions highlighted over time.(b) Segmented kinematics displaying the trajectory of joint positions and velocities compiled from all trials across the subject cohort.
(a) Kinematics segmentation based on heel strike detection.(b) Representation of the segmented kinematics (positions and velocities) for all joints and axes.

Figure 2 .
Figure 2. Comprehensive Gait Analysis Methodology and Kinematic Data Visualization.(a) Data Segmentation graph depicting how gait cycles are delineated using heel strike events detected by ground reaction forces on the force plate, with step transitions highlighted over time.(b) Segmented kinematics displaying the trajectory of joint positions and velocities compiled from all trials across the subject cohort.
(a) Multidimensional visualization of all the gait key-points (K) for all joint axes.(b) Results of regression analysis at the six key-points of the ankle X-trajectory with gait velocity as the independent variable.

Figure 3 .
Figure 3. Multidimensional analysis of gait key-points along all joints and axes and example of regression-based key-points estimations: (a) the spread of key-points for the hip, knee, and ankle joints across the X, Y, and Z throughout various gait cycles and velocities for the entire subject group; (b) example of key-point estimation for the ankle trajectory in X, correlating joint position and gait cycle percentage of each key-point with gait speed through regression-based analysis.

Figure 3 .
Figure 3. Multidimensional analysis of gait key-points along all joints and axes and example of regression-based key-points estimations: (a) the spread of key-points for the hip, knee, and ankle joints across the X, Y, and Z throughout various gait cycles and velocities for the entire subject group; (b) example of key-point estimation for the ankle trajectory in X, correlating joint position and gait cycle percentage of each key-point with gait speed through regression-based analysis.

Figure 4 .
Figure 4. Comparative visualization of measured and reconstructed gait trajectory for a representative subject.The original, measured 3D joint trajectories (solid lines) are presented alongside the estimated trajectories (dashed lines) generated through regression-based models.The spline reconstruction method is applied to the estimated key-points (displayed in black), demonstrating the smooth and close approximation of the reconstructed path to the measured trajectory.

Figure 4 .
Figure 4. Comparative visualization of measured and reconstructed gait trajectory for a representative subject.The original, measured 3D joint trajectories (solid lines) are presented alongside the estimated trajectories (dashed lines) generated through regression-based models.The spline reconstruction method is applied to the estimated key-points (displayed in black), demonstrating the smooth and close approximation of the reconstructed path to the measured trajectory.

23 Figure 5 .
Figure 5. Schematic of the LSTM Neural Network to predict 3D positions of the lower joints one step ahead.Sequences of 11 features are received by the input layer, including height, gait velocity, and the positions of all lower extremity joints in the space.The network comprises two LSTM layers, each with 120 hidden units connected by dropout layers with a rate of 0.2 to prevent overfitting by randomly omitting features during training.The final output sequence represents the 3D positions of the nine joints corresponding to the subsequent 100 time-normalized points.

Figure 5 .
Figure 5. Schematic of the LSTM Neural Network to predict 3D positions of the lower joints one step ahead.Sequences of 11 features are received by the input layer, including height, gait velocity, and the positions of all lower extremity joints in the space.The network comprises two LSTM layers, each with 120 hidden units connected by dropout layers with a rate of 0.2 to prevent overfitting by randomly omitting features during training.The final output sequence represents the 3D positions of the nine joints corresponding to the subsequent 100 time-normalized points.

Figure 6 .
Figure 6.Results of LSTM Predictions for successive gait cycles.This figure illustrates the presented LSTM network's capability to predict 3D joint trajectories.The known joint positions are delineated as light blue lines, LSTM's predictions are represented as dashed black lines, and the true future positions are shown as teal lines.The input sequence depicting the last 100 time-normalized data points is determined by the shaded light blue areas.In contrast, the network's prediction output for the given input is highlighted in the teal areas.The charts across the X-Z and Y-Z planes, alongside the temporal sequences for the X, Y, and Z axes, collectively affirm the model's fidelity in replicating the intricate, rhythmic motions characteristic of human ambulation.

Figure 6 .
Figure 6.Results of LSTM Predictions for successive gait cycles.This figure illustrates the presented LSTM network's capability to predict 3D joint trajectories.The known joint positions are delineated as light blue lines, LSTM's predictions are represented as dashed black lines, and the true future positions are shown as teal lines.The input sequence depicting the last 100 time-normalized data points is determined by the shaded light blue areas.In contrast, the network's prediction output for the given input is highlighted in the teal areas.The charts across the X-Z and Y-Z planes, alongside the temporal sequences for the X, Y, and Z axes, collectively affirm the model's fidelity in replicating the intricate, rhythmic motions characteristic of human ambulation.

Figure 7 .
Figure 7. RMSE and Loss learning curves during the LSTM training.Solid lines represent the training set and dashed lines the validation set.

Figure 7 .
Figure 7. RMSE and Loss learning curves during the LSTM training.Solid lines represent the training set and dashed lines the validation set.

Figure 8 .
Figure 8. Validation of regression and LSTM-based models.Figures in the first row display the true gait trajectories of the training set in solid blue lines alongside the model predictions in dotted dashed black lines in both the sagittal and frontal planes, illustrating the models' fidelity in spatial trajectory estimation.The second row contains bar graphs representing the correlation coefficients and RMSE metrics for each joint index, providing a quantitative assessment of each model's performance.

Figure 8 .
Figure 8. Validation of regression and LSTM-based models.Figures in the first row display the true gait trajectories of the training set in solid blue lines alongside the model predictions in dotted dashed black lines in both the sagittal and frontal planes, illustrating the models' fidelity in spatial trajectory estimation.The second row contains bar graphs representing the correlation coefficients and RMSE metrics for each joint index, providing a quantitative assessment of each model's performance.

Figure A1 .
Figure A1.Comparative analysis of gait prediction error accuracy using regression-based and LSTM-based models under velocity variations.This figure displays box plots illustrating the RMSE (Root Mean Square Error) values for both regression-based and LSTM-based models across varying velocities (km/h) for the hip, knee, and ankle joints in the X, Y, and Z axes.Each box plot details the RMSE computed from true and estimated joint trajectories, highlighting the median, quartile ranges (Q1, Q3), and outliers.This visualization aids in evaluating the models' precision in estimating gait movements under different walking speeds.

Figure A1 . 23 Figure A2 .
Figure A1.Comparative analysis of gait prediction error accuracy using regression-based and LSTM-based models under velocity variations.This figure displays box plots illustrating the RMSE (Root Mean Square Error) values for both regression-based and LSTM-based models across varying velocities (km/h) for the hip, knee, and ankle joints in the X, Y, and Z axes.Each box plot details the RMSE computed from true and estimated joint trajectories, highlighting the median, quartile ranges (Q1, Q3), and outliers.This visualization aids in evaluating the models' precision in estimating gait movements under different walking speeds.Biomimetics 2024, 9, x FOR PEER REVIEW 21 of 23

Figure A2 .
Figure A2.Comparative analysis of gait prediction error accuracy using regression-based and LSTMbased models under height variations.This figure displays box plots illustrating the RMSE (Root Mean Square Error) values for both regression-based and LSTM-based models across varying heights (cm) for the hip, knee, and ankle joints in the X, Y, and Z axes.Each box plot details the RMSE computed from true and estimated joint trajectories, highlighting the median, quartile ranges (Q1, Q3), and outliers.This visualization aids in evaluating the models' precision in estimating gait movements under different subject heights.

Figure A2 .
Figure A2.Comparative analysis of gait prediction error accuracy using regression-based and LSTM-based models under height variations.This figure displays box plots illustrating the RMSE (Root Mean Square Error) values for both regression-based and LSTM-based models across varying heights (cm) for the hip, knee, and ankle joints in the X, Y, and Z axes.Each box plot details the RMSE computed from true and estimated joint trajectories, highlighting the median, quartile ranges (Q1, Q3), and outliers.This visualization aids in evaluating the models' precision in estimating gait movements under different subject heights.

Figure A3 .
Figure A3.Comparative analysis of gait prediction correlation accuracy using regression-based and LSTM-based models under velocity variations.This figure displays box plots illustrating the correlation coefficient () values for both regression-based and LSTM-based models across varying velocities (km/h) for the hip, knee, and ankle joints in the X, Y, and Z axes.Each box plot details the correlation computed from true and estimated joint trajectories, highlighting the median, quartile ranges (Q1, Q3), and outliers.This visualization aids in evaluating the models' precision in estimating gait movements under different walking speeds.

Figure A3 . 23 Figure A4 .
Figure A3.Comparative analysis of gait prediction correlation accuracy using regression-based and LSTM-based models under velocity variations.This figure displays box plots illustrating the correlation coefficient (ρ) values for both regression-based and LSTM-based models across varying velocities (km/h) for the hip, knee, and ankle joints in the X, Y, and Z axes.Each box plot details the correlation computed from true and estimated joint trajectories, highlighting the median, quartile ranges (Q1, Q3), and outliers.This visualization aids in evaluating the models' precision in estimating gait movements under different walking speeds.Biomimetics 2024, 9, x FOR PEER REVIEW 22 of 23

Figure A4 .
Figure A4.Comparative analysis of gait prediction correlation accuracy using regression-based and LSTM-based models under velocity variations.This figure displays box plots illustrating the correlation coefficient (ρ) values for both regression-based and LSTM-based models across varying heights (cm) for the hip, knee, and ankle joints in the X, Y, and Z axes.Each box plot details the correlation computed from true and estimated joint trajectories, highlighting the median, quartile ranges (Q1, Q3), and outliers.This visualization aids in evaluating the models' precision in estimating gait movements under different subject heights.

Table 2 .
Validation results when computing the joint specific (J) and overall (O) RMSE in millimetres and correlation for the validation set.

Table 3 .
RMSE and Pearson's correlation coefficient variability for each joint axis estimation.RMSE and correlation coefficients were computed for each estimated step, RMSE step and ρ step respectively, for all axes of each joint.The average, standard deviations, quartiles and percentage of outliers are computed for each R and C to illustrate the model accuracies for each step.

Institutional Review Board Statement:
Not applicable.The processed data and results generated during the study are not publicly available but can be made available by the authors upon request.The raw data used in the study is available in Fukuchi, C.A.; Fukuchi, R.K.; Duarte, M. A Public Dataset of Overground and Treadmill Walking Kinematics and Kinetics in Healthy Individuals.PeerJ 2018, 6, e4640.https: //doi.org/10.7717/PEERJ.4640(accessed on 9 May 2024).The authors declare no conflicts of interest.

Table A1 .
Regression results for the hip key values in the x, y and z axis.

Table A2 .
Regression results for the knee key values in the x, y and z axis.

Table A3 .
Regression results for the ankle key values in the x, y and z axis.