Reliable and clinically applicable gait event classification using upper body motion in walking and trotting horses

Objectively assessing horse movement symmetry as an adjunctive to the routine lameness evaluation is on the rise with several commercially available systems on the market. Prerequisites for quantifying such symmetries include knowledge of the gait and gait events, such as hoof to ground contact patterns over consecutive strides. Extracting this information in a robust and reliable way is essential to accurately calculate many kinematic variables commonly used in the field. In this study, optical motion capture was used to measure 222 horses of various breeds, performing a total of 82 664 steps in walk and trot under different conditions, including soft, hard and treadmill surfaces as well as moving on a straight line and in circles. Features were extracted from the pelvis and withers vertical movement and from pelvic rotations. The features were then used in a quadratic discriminant analysis to classify gait and to detect if the left/ right hind limb was in contact with the ground on a step by step basis. The predictive model achieved 99.98% accuracy on the test data of 120 horses and 21 845 steps, all measured under clinical conditions. One of the benefits of the proposed method is that it does not require the use of limb kinematics making it especially suited for clinical applications where ease of use and minimal error intervention are a priority. Future research could investigate the extension of this functionality to classify other gaits and validating the use of the algorithm for inertial measurement units. 2020 The Authors. Published by Elsevier Ltd. This is an open access article under theCCBY license (http:// creativecommons.org/licenses/by/4.0/).


Introduction
Objective assessment of gait symmetry as a tool in lameness examinations is becoming increasingly popular. Commercially available tools like Lameness Locator (Equinosis) and EquiMoves by Inertia are based on inertial sensors (Bosch et al., 2018;Keegan, 2007), QHorse (Qualisys AB) is based on optical tracking of markers on the upper body of the horse . Automated systems like these perform a number of operations on the measured raw signals to make them applicable as a tool for clinical kinematic analysis. Firstly, the signal is segmented into strides detected within the locomotion pattern. A stride is a periodic event that repeats itself in the movement cycle which consists of two consecutive steps for both the fore and hind limbs. Secondly, gait events occurring during the stride are identified. In the current context a gait event is defined as the ground contact phase of an individual limb during a stride. Finally, the vertical displacement symmetry between left and right steps, as described by Buchner (Buchner et al., 1996), can be quantified. Asymmetry in the vertical displacement of head or pelvis between two consecutive steps is the standard variable to determine the severity and location of a lameness when assessing a horse visually during a trot up or when using motion sensors .
The first two steps of the analysis can be performed simultaneously using an inertial sensor (Keegan et al., 2004) or optical markers (Peham et al., 1999) placed on the limb(s). Another alternative is to use hoof mounted accelerometers, which can also classify the gait (Robilliard et al., 2007). However, there are instances where an https://doi.org/10.1016/j.jbiomech.2020.110146 0021-9290/Ó 2020 The Authors. Published by Elsevier Ltd. This is an open access article under the CC BY license (http://creativecommons.org/licenses/by/4.0/). IMU/marker on the limb/hoofs can interfere with the clinician's work, for example when performing diagnostic anaesthesia. The time consumption of preparing the horse for measurement must also be considered. One alternative to using limb kinematics to detect ground contact events is to segment strides using the vertical position minima of one tuber coxae and identifying that event as the opposite limb in stance (Walker et al., 2010) A second alternative is to use the pelvis vertical velocity minima (walk) and pelvis vertical velocity zero crossing (trot) for stride segmentation and the pelvic roll for left/right limb ground contact detection (Starke et al., 2012). Walkers preliminary approach is stated as not being completely automated and not validated for accuracy. Starke et al could show that their method correctly identified all steps in the dataset, but the study was limited by a low number of test subjects (N = 10) and a single type of surface (tarmac). Whether this method is applicable to data gathered under clinical conditions has not been validated.
Gait event detection using upper body kinematics, as done by Walker et al. (2010) and Starke et al. (2012), require prior knowledge of the horse's gait. In an ideal situation the gait is known beforehand, but in a clinical situation the horse might transition between gaits during a single measurement. An automated process for classifying the gait of individual strides would minimize the time required to either repeat the measurement or manually select the desired strides. To the authors knowledge there is no literature describing a method to achieve both gait event detection and gait classification without the use of optical markers or IMU's/sensors located on the limbs or the hoofs.
This study sets out to propose and validate a classification model that performs left/right ground contact event detection and differentiates between walk and trot using only kinematics of the upper body of the horse. Further, the proposed method is compared with the approach published by Starke et al. (2012). We hypothesize that our model will outperform Starke et al's method in terms of correctly classified steps.

Data collection
The data used in this study was a compilation of measurements gathered from research projects and clinical lameness evaluations performed at four different equine clinics; one project has previously been described by Rhodin et al. (2018). Horses of approximately 40 different breeds were included, though the majority were of European type warmbloods (supplementary table 1). Kinematic data was collected using a camera-based motion capture system (Oqus 7+ or Oqus 400, Qualisys AB, Gothenburg, Sweden) at 100-240 Hz. All horses had been fitted with reflective markers in varying setups, all sharing similarly placed markers on the pelvis and the withers, either through the use of premanufactured rubber clusters or at anatomical landmarks as described by Rhodin et al. (2018). Pelvis markers where placed on, or close to, the tuber sacrale, left and right tuber coxae and one withers marker was placed on or close to T8 (Fig. 1).
The data was split into a data set A and B. Set A was used for training and validation and set B was used to test the model. Set A included measurements all performed on an instrumented treadmill (Weishaupt et al., 2002) and comprised 102 subjects, 1 313 measurements and 60 819 steps (Table 1). Ground reaction forces were recorded at 480-512 Hz using the HP2 software (University of Zurich, Switzerland) and were synchronized with the kinematic recording at the start of each measurement using a wired trigger. In this set, 92 of the horses were measured repeatedly at different speeds, walk (1.8 ± . 2 m/s, 1.4-2.1 m/s) and trot (4.3 ± 0.8 m/s, 2.5-6.6 m/s). The remaining 10 subjects were measured repeatedly at the same speed at walk (1.7 ± 0.1 m/s) and trot (3.9 ± 0.1 m/s) and different degrees of lameness . Weight-bearing lameness was induced in each limb separately using a sole pressure model (Merkens and Schamhardt, 1988). The instrumented treadmill was used as the gold standard for determining ground contact events of each individual limb.
Set B consisted of 120 horses and 483 measurements from three equine clinics, Universiteit Utrecht Faculty of Veterinary Medicine in The Netherlands, Tierklinik Lüsche in Germany and the University Animal Hospital in Uppsala Sweden. The selection of measurements was done randomly with the minimum requirement of at least 20 steady state steps per measurement, resulting in a total of 21 845 steps. Each measurement was manually categorised by gait, surface and lunge or straight-line movement. This resulted in five categories (A-E), (A) hard surface walk, (B) hard surface trot, (C) hard surface trot on left and right circle, (D) soft surface trot and (E) soft surface trot on left and right circle. All measurements in set B were collected as a part of an orthopaedic exam (Table 2). Synchronised videos allowed visual labelling of gait and left/right ground contact events for each measurement. All horses included in this study were cared for in accordance with local ethical regulations.
All data analysis was performed in Matlab (MATLAB, 2019b, The Mathworks Inc, Natick, USA).

Stride segmentation
Measurements were segmented into strides by finding peaks in the vertical position of the pelvis (p z ) and the corresponding troughs between peaks. A segment spanning over three peaks and two troughs was defined as a full stride, consisting of a left and a right step. At trot the first maxima occur close to hoof on and around mid-stance in walk (Fig. 2). Four trajectories were extracted: withers vertical position (w z ), pelvis vertical position (p z ), pelvis roll (p r ) and pelvis yaw (p y ). Pelvis roll and yaw were derived from the vector (v) pointing from the right tuber coxae marker to the left tuber coxae marker. Roll was defined as the angle between v and the global horizontal plane (xy-plane). Yaw was defined as the angle from v to the vertical plane defined by one horizontal axis (x) and the vertical axis (z) of the global reference frame.

Time frequency analysis
The trajectories X ¼ ðw z ; p z ; p r ; p y Þ all exhibited cyclic patterns, suitable to be processed by Fourier analysis. Analysis of the frequency domain of signals in equine biomechanics is a known concept and has previously been used to extract sinusoidal component coefficients and analyse lameness (Audigié et al., 2002;Peham et al., 1996).
For the purpose of this study it was necessary to transform each stride individually. As the underlying processes of the signals were typically non-stationary and consecutive strides sometimes had different frequency content, a time-frequency analysis resembling the Short Time Fourier Transform was developed: From every stride i in the trajectories in X the fundamental frequency component and two harmonics ðw nzi ; p nzi ; p nri ; p nyi n¼1;2;3 Þ were extracted using FFT. Higher order components (n > 3) where not included as their amplitude quickly became very small and variable in the present data set. To perform an FFT on a single stride and still be able to extract these components, each stride was replicated at least 3 times, then concatenated and detrended before performing the FFT. With this approach it was possible to recreate each stride (i) of the measured trajectories X i ¼ ðw zi ; p zi ; p ri ; p yi Þ, as a Fourier series with an additional term for the slope.
where A is the offset, B 0 is the slope, B n is the amplitude, x is the stride angular frequency, u n is the phase shift, t is the time in seconds and e is the residual. This approach enabled perfect separation of the component frequencies and avoided the spectral leakage caused by windowing and/or padding normally associated with the FFT. When quantifying the phases of the components it was possible to correct the stride segmentation performed on p z . When there was an asymmetric component present in the pelvic vertical move- Table 1 Descriptive Statistics of data set A, used for training and validation. Number of measurements, steps, speed range and mean, and withers height range and mean, per gait category for 102 horses in 1313 measurements. 92 of horses were measured at different speeds and 10 were measured at their own preferred speed but with lameness artificially induced in different limbs. All measurements were done on an instrumented treadmill.

Gait
Steps  Table 2 Descriptive statistics of data set B used for testing. Number of subjects, steps, range, mean and standard deviation of withers speed and withers height, per condition (A-E) and gait for 120 horses in 483 measurements. All measurements were performed with the horses moving over ground either led or lunged by a handler. None of the horses were measured during all conditions. The conditions performed were mainly dictated by the available facilities at each clinic.  ment, the stride segmentation relative to the real stride would differ depending on if the split was done at the left or the right limb ground contact phase. To correct for this potential difference all extracted phase values where shifted to be relative to u 2pz . A step by step instruction to perform the signal decomposition can be found in the supplementary material.

Feature selection
For the gait event classification modelling, features were selected based on the following observations made in this projects data. Withers vertical range of motion was generally smaller than pelvis vertical range of motion in walk compared to at trot. At trot, pelvis and withers vertical movement were in phase, i.e. minima and maxima of both markers occurred at approximately the same time. In walk the vertical movement of withers and pelvis were out of phase of each other, i.e. a maximum in the pelvis occurred approximately at the same time as a withers' minima (Fig. 2). Thus, the ratio of the first harmonic amplitude of the withers and the pelvis, f ratio ¼ B 2wz =ðB 2pz þ B 2wz Þ, and the withers first harmonic phase, u 2wz , were chosen as features for gait classification. The roll and yaw of the pelvis exhibited a symmetrically asymmetric pattern, meaning the pelvis roll and yaw during left stance was a mirrored version of the rotations during the right stance phase, similarly described by Starke (Starke et al., 2012). According to this theory, odd component phase shifts of a left step would always be shifted 180 degrees when compared to a right step. This made the phases of the odd components of the pelvic roll (p r ) and yaw (p y ), u 1pr ; u 3pr ; u 1py ; u 3py suitable features for left/right gait event detection (Fig. 3).

Gait event classification
Quadratic discriminant analysis (QDA) was used to classify strides as one of four classes defined by gait and first hindlimb ground contact phase, walk left (wa l ) or right (wa r ) and trot left (tr l ) or right (tr r ). Before the selected features could be used as predictors, they were transformed from polar coordinates to Cartesian coordinates, resulting in two variables per component, where x ¼ w z ; p r ; p y , n ¼ 2 for w z and n ¼ 1; 3 for p r ; p y . This resulted in two predictors for the withers (f 2wzcos; f 2wzsin ), four for pelvis roll (f 1prcos; f 1prsin; f 3prcos; f 3prsin Þ and four for pelvis yaw ðf 1pycos; f 1pysin; f 3pycos; f 3pysin ). Including the withers and pelvis amplitude ratio (f ratio ), this meant a total of 11 features were used as predictors in the QDA. The model was trained and cross validated, using 5 folds, on data set A and tested on data set B. The result of the QDA prediction was compared to the result of the approach suggested by Starke et al. (2012). As Starke et al.'s method did not classify gait, the correct gait was manually added to the left/right ground contact classification.

Statistics and feature visualisation
For all directional (circular) statistics the Circular Statistics Toolbox for Matlab was used (Berens, 2015). Mean and standard deviations of the features in polar coordinates were calculated for both data sets and visualised in histograms to verify that the initial observations used to select features were valid and relevant. Fig. 3. Example of pelvis roll (p r ) and yaw (p y ) and their first (p 1r ; p 1y ) and third (p 3r ; p 3y ) sinusoidal components for one stride in walk and trot. The vertical ground reaction forces for left and right hind (pvf hl ; pvf hr ) are depicted for hoof contact reference. Sinusoidal components with an uneven number of periods are always asymmetric over one stride. This figure illustrates a left to right stride, in the case of a right to left stride the uneven sinusoidal components would be shifted 180 degrees.
In the visualisations all right steps were shifted by half a stride to simulate being left steps.
This phase resolution is important to keep in mind when interpreting the results.

Gait event classification
The QDA model correctly classified 100% of the steps in the cross validation of data set A. Classification results from data set B are shown in confusion graphs in Fig. 4. In data set B 99.98% of the steps were correctly classified, 4 out of 18 632 steps were misclassified at trot and none at walk. The Starke model correctly classified 84.6% of the steps in data set A, 217 out 17 861 steps were misclassified in walk and 9 151 out of 42 958 at trot. In data set B 80.16% of the steps were correctly classified by Starke et al's model, 89 out of 3 213 steps were misclassified in walk and 4 246 out of 18 632 steps at trot. The QDA model mean and covariance matrices can be found in the supplementary Tables 2-6.

Features
Descriptive statistics of the selected features for the two gaits and the two data sets are presented in Table 3. The estimated probability density functions of the features are illustrated in Fig. 5 and in Fig. 6. The f ratio mean ± SD in data set A/B was 0.51 ± 0.04/0.49 ± 0.05 at trot and 0.22 ± 0.07/0.25 ± 0.09 at walk. The mean ± SD difference in the phase shift between walk and trot (u 2wz ) in data set A/B was 157°± 6.8°/168°± 7.1°. Both features (f ratio ; u 2wz ) exhibited overlaps between the two gaits, meaning Table 3 Descriptive statistics of the features used to calculate the predictors for the quadratic discriminant analysis. Feature mean and standard deviation is listed per gait (walk, trot) and data set (A, B).

Feature
Set  no single feature was able to accurately discriminate between the gaits (Fig. 5). When scrutinising the pelvis rotation features, it was found that the amplitude of the first component of roll (B 1pr ) was the biggest one in both gaits where the meanAESD in data set A/B was 5.1°± 1. 2°/3.9°± 1.3°in walk and 2.3°± 1.0°/2.6°± 1.3°at trot. All rotation mean amplitudes were greater in walk compared to trot for both data sets and the standard deviations in set B were greater than or equal to those in set A. This is not surprising as the data from set B were acquired under a variety of different conditions in a clinical setting.
The phases of the pelvis rotational components generally exhibited smaller standard deviations in walk compared to at trot. At trot, the standard deviation of u 3pr was notably smaller than in the other components, 16°in set A and 30°in set B. It was interesting that a component which was relatively small and hard to distinguish in the raw rotation of the pelvis was relatively the least variable one (Table 3). The estimated probability distributions of  the pelvis rotational phases (Fig. 6) all showed that they were concentrated around their means. Still, no single phase was sufficient to correctly classify all steps as left or right. One notable finding was that 8 horses had mean values of u 1pr approximately 180 degrees shifted from the mean of data set A, effectively rolling their pelvis in the opposite direction of what is shown in Fig. 3.

Gait event classification
Despite the variation in gait, speed, surface, breed and size of the horses the QDA model trained on dataset A correctly classified all but four steps of the test dataset (B). The study was limited due to the model only classifying two gaits (walk, trot) and being developed mainly on warmblood breeds why generalizability could be questioned. Anatomical variation of horses in data set B still indicate robustness of the model, although the sample size of non-warmbloods was relatively small. There also exist other breeds that were not tested and for which validity could not be claimed. Another drawback was that no horses in data set B were walked on a soft surface or on a circle. However, given the consistency and distinctness of the predictor variables in walk it seems likely the model predictions would work on soft surfaces and circles as well. The relationship between the selected features and lameness-related movement asymmetry was not investigated in this study. However, because of the inclusion of subjects with induced lameness in data set A, and the subjects in data set B being clinical cases, we are confident that only extreme movement asymmetries might affect the outcome of the gait event classification model.

Features
It is clear from the results that the odd rotational components of the pelvis are more pronounced and less variable in walk than at trot. This might be because the 2-3 limb support phases in walk physically limit how the pelvis can be rotated compared to trot with its 2-limb support phase and airborne phase.
When studying the pelvic roll components (p 1r ; p 3r ), it is intuitive to think of p 1r as the rotation caused by the horse raising its hip when the contralateral limb is pushing of the ground (see Fig. 3). Interpreting p 3r is more difficult, but we speculate it might be related to kinetic events such as hoof contact and peak vertical force, and/or an effect of muscle contractions. One interesting find was the fact that several horses had a mean u 1pr almost opposite of the mean of the population at trot. Instead of raising the contralateral tuber coxae during push off it was lowered. In addition to this, some subjects had an amplitude of p 1r which was smaller than that of p 3r , resulting in a total roll without a clearly distinguishable minima and maxima for left and right steps respectively. This relatively large variation in amplitude and phase of the two roll components might partly explain why the Starke model did not perform as well as the QDA at trot.

Time frequency analysis
When studying pelvic rotation and withers vertical movement it becomes apparent that, because of the complex signal patterns more traditional feature extraction methods, such as min/max values and range of motion, are not always sufficient to reliably detect left/right ground contact events and gait. Because of the cyclic nature of the equine gait and the need to extract information on a stride by stride basis, time-frequency analysis is a good methodological approach to solve the problem. The main motivation for using the segmented FFT analysis proposed in this paper, over for example a Short Time Fourier Transform (STFT) or a Wavelet transform (WT) is because of its robustness in extracting a predetermined number of features per stride, even for a single stride. Another benefit is that it does not require any pre-processing of the signal. Performing STFT or WT on a signal with missing data points, which is not an uncommon situation when working with optical motion capture data, requires the gaps to be filled beforehand.

Conclusions
In conclusion, a new model for classifying hind limb left/right ground contact events and gait (walk, trot) in horses, based solely on upper body kinematics have been presented and is more accurate than the previous models described. The initial observations used to select features were confirmed to be valid and relevant. The QDA model could reliably predict both gait and footfall sequence given the extracted features in a variety of settings commonly encountered in clinics. Pelvic rotations -especially rollvary considerably between subjects and/or experimental conditions. The use of time-frequency analysis can be beneficial to extracting discrete variables from complex movements for use in discriminant analysis. The method requires a very limited marker setup with variables that could be extracted from sensors as well. Further research is needed to quantify the specific effects of surface, type, speed and movement asymmetry on the components of pelvic rotation. Extending the model to include other common gaits would be beneficial, but more kinematic data would be required for this endeavour. Furthermore, the community would benefit from adapting and validating the method to make it applicable to inertial measurement units.

Declaration of Competing Interest
The salary of Christoffer Roepstorff was partially funded by Qualisys AB. However, Qualisys AB had no influence on the outcome of this study. No other authors declare any conflicts of interest.