Visualizing Gait Patterns of Able bodied Individuals and Transtibial Amputees with the Use of Accelerometry in Smart Phones

Human gait analysis is used to indirectly monitor the rehabilitation of patients affected by diseases or to directly monitor patients under orthotic care. Visualization of gait patterns on the instrument is used to capture the data. In this study, we created a mobile application that serves as a wireless sensor to capture movement through a smartphone accelerometer. The application was used to collect gait data from two groups (able-bodied individuals and unilateral transtibial amputees). Standard gait activities such as walking, running and climbing, as well as the non-movement activity of sitting, were captured, stored and analyzed. This paper discusses different visualization techniques that can be derived from accelerometer data. After gravity is removed, accelerometer data can be transformed into distribution data using periodicity; features were then derived from the histograms. Decision tree analysis shows that only three significant features are necessary to classify subject activity, namely: the average of minimum peak values, the Student's t-statistic of minimum peak values and the mode of maximum peak values. We found that the amputee group had a higher acceleration and a lower skewness of the period between peaks of acceleration than the able-bodied group.


Defining Gait
Gait is defined as a cycle of limb movements meant to safely advance the body with minimum effort, where the act of walking involves periodic movement of each foot and ground reaction forces (Baritz, Cotoros, Cristea & Rogozea 2010). A gait cycle is defined as consecutive movements of the same leading extremity during locomotion (Khan 2010). Simply put, gait can be defined as a manner or style of walking (Kishner & Monroe 2013). Between periodic movement and ground reaction forces, the periodic movement defines the cyclic nature of human gait (Arnold 2007). One cycle consists of eight events: initial contact (0%), loading response (0-10%), midstance (10-30%), terminal stance (30-50%), pre-swing (50-60%), initial swing (60-70%), mid-swing (70-85%), and terminal swing (85-100%) (Baritz et al. 2010). However, the definitions of these terms are not universally accepted. For instance, there is no universal definition for midstance and mid-swing when describing the gait cycle, which hinders comparison between different gait studies (Gibson, Jeffery & Bakheit 2006). Therefore, other dimensions are used in defining and measuring gait.

Measuring Gait
Gait is measured to understand the way a person moves. Gait measurement tools usually comprise a detection device, similar to a treadmill, on which a person performs a gait activity such as walking. Such devices provide visual reports on various features of gait such as acceleration, velocity, stride length and cadence, among others. Although this equipment provides the gold standard in gait detection and analysis, its cost makes access very limited (Macleod, Conway, Allan & Galen 2014). For home rehabilitation, for example, it is necessary to come up with low-cost alternatives. Without a gait machine, gait analysis can be done by observation using a scale (Bella, Rodrigues, Valenciano, Silva & Souza 2012). For example, five of the most common gait observation tools are the New York Medical School Orthotic Gait Analysis (NYMSOGA), the Hemiplegic Gait Analysis Form (HGAF), the Wisconsin Gait Scale (WGS), the Gait Assessment and Intervention Tool (GAIT), and the Rivermead Visual Gait Assessment (RVGA) (Ferrarello et al. 2013). In particular, GAIT is preferred over the other tools in the context of stroke victim rehabilitation, as it has proven to be reliable, comprehensive, valid and sensitive to change (Ferrarello et al. 2013). Video data is another way of collecting spatio-temporal information for gait analysis (Okusa & Kamakura 2013). Video data can be processed using a segmentation method to single out each pedestrian from the background. The pattern of movement of pedestrians can then be captured by observing the cyclic behavior of the segmented pedestrians. Video recording and a detailed observational tool for gait analysis are often used together to further validate findings (Ferrarello et al. 2013).
An alternative way to measure gait is the development of simple gait analysis devices that take advantage of the affordance and affordability of wireless sensors. Wi-Gat is an example of a portable wireless gait assessment tool that measures spatio-temporal gait parameters, proving to be a possible candidate for those in search of a low-cost assessment tool (Macleod et al. 2014). Such devices would be very useful in remote areas of developing countries, where monitoring pathological gait requires the development of low-cost alternatives.
In this study, the accelerometer in a mobile phone is used to gather data for gait analysis. Specifically, we created a mobile and cloud application that uses the phone's accelerometer to collect raw gait information, which is later sent to the cloud for data transformation, analysis, modeling and prediction. The web component provides a visualization and a report on the gait pattern. The mobile phone receives the predicted gait movement (it is currently able to detect walking, running, climbing and stationary movement). This paper presents gait analysis between able-bodied individuals and transtibial amputees, studied in three dimensions. The first dimension is detecting gait. In gait detection, instruments are attached to the human subject, usually on a limb, as the subject performs specific activities such as walking or stair climbing. In this dimension, the goal is to detect whether the action performed falls under the definition of gait, as opposed to simply moving the device (pretending to perform a gait motion). The next dimension is gait identification. Gait identification involves analyzing the movement and determining what kind of movement was performed, validating it against the expected movement. For example, if the subject walks, gait movement should be detected and the movement should be classified as walking, as opposed, for example, to running. Using periodicity, raw accelerometer data was transformed to generate spatio-temporal features. A decision tree was used to classify four movements, namely: sitting, walking, climbing and running. The third dimension of gait analysis is visualization, which allows physicians to monitor progress. This paper presents different visualization techniques applied to the collected patient gait data.

Participants
The study included a total of 26 participants: 18 able-bodied individuals and 8 transtibial amputees. In the able-bodied group, 8 females with an age range of 22 to 45 and 10 females with an age range of 22 to 40 participated. The transtibial amputee group was composed of 3 male participants with an age range of 18 to 50 and 5 females with an age range of 23 to 50. There were no inclusion criteria for able-bodied persons. Transtibial amputation was the inclusion criterion for amputees.

Capture of Accelerometer Data
An Android-based application was developed to capture motion data using the phone's built-in 3D accelerometer sensor. The unit used was an HTC One Android phone with a built-in Bosch BMA250 3-axis accelerometer (Version 1) with a power of 0.1 mA, a resolution of 0.038 SU and a maximum range of 39.23 SU. Sensor Tester, a free software application, was used to calibrate the accelerometer.
The phone was placed in a pouch, which was attached to the participant's calf. The application requires the following input: the name of the participant and the type of motion (sit, walk, run, climb). The start button indicates the start of data capture and the stop button indicates its end.
The data collected is stored in an array with the following values: $a_x(t)$, $a_y(t)$, $a_z(t)$ and the number of steps at that given time. Figure 1 shows an example of the raw acceleration data.

Gait Experiments
Gait experiments were conducted to capture motion data for two main groups of subjects, the able-bodied group and the amputee group. All subjects performed four types of activities. Figure 2 shows the list of experiments; the eight categories of subject-activities are included in the right column. Experiments were performed in covered walkways and corridors. The device was attached to the person's calf and securely fastened with strong adhesive. The application was launched, and the person's name and the type of gait (sit, walk, run or climb) were selected. Participants were instructed to perform the assigned gait for 30 seconds, after which the application was set to stop. Figure 4 shows an image of the actual walking experiment.

Results
The motion data set was derived from two main groups: motion captured from able-bodied individuals and from transtibial amputees. Each subject in the experiment was asked to walk, sit, climb, or run. We registered 8 categories according to the subject and the activities performed:
1. Able-bodied subject sit
2. Able-bodied subject walk
3. Able-bodied subject climb
4. Able-bodied subject run
5. Amputee subject sit
6. Amputee subject walk
7. Amputee subject climb
8. Amputee subject run
The number of datasets (each dataset consists of approximately 30 seconds of acceleration values gathered at a frequency of approximately 30 Hertz) is not equal for all categories, owing to the availability of volunteer subjects for the experiments. Table 1 shows the number of participants per activity.

Data Transformation
The goal of the study is to detect gaits based on acceleration data.Our approach is to use the distribution of the peaks in terms of their magnitude and period between two consecutive peaks.
The data consists of 3 directions of acceleration: $a_x(t)$, $a_y(t)$ and $a_z(t)$. The gravity value of 9.81 m/s² was removed from the data to ensure that the data only comprised acceleration. The acceleration value for each point is computed as the modulus of the three acceleration directions, $\sqrt{a_x(t)^2 + a_y(t)^2 + a_z(t)^2}$.
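The magnitude computation can be sketched as follows. This is a minimal illustration, not the authors' code: the text does not say exactly how gravity was removed, so subtracting the constant 9.81 m/s² from the modulus is an assumption, and the function name `acceleration_magnitude` is hypothetical.

```python
import math

GRAVITY = 9.81  # m/s^2, the value quoted in the text


def acceleration_magnitude(ax, ay, az, gravity=GRAVITY):
    """Modulus of the three axis readings, minus gravity.

    Assumption: gravity is removed by subtracting a constant from the
    modulus; the source does not specify the exact procedure.
    """
    return math.sqrt(ax * ax + ay * ay + az * az) - gravity


# Illustrative samples (not study data): a phone at rest and a moving phone.
samples = [(0.0, 9.81, 0.0), (3.0, 4.0, 12.0)]
magnitudes = [acceleration_magnitude(*s) for s in samples]
```

Under this assumption a phone at rest yields a value near zero, so only movement contributes to the peaks analyzed later.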
Each data set was obtained at a frequency of approximately 30 Hertz (30 sample points per second).
To filter out possible noise, the data was smoothed using a low-pass filter with smoothing parameter α. The value α = 0.1 was chosen such that the data and the smoothed values peaked at almost the same time. Figure 5 shows an example of smoothed data (blue) and the actual data (red). The horizontal axis is the time stamp and the vertical axis represents acceleration. The value of the smoothed data is slightly lower than the actual acceleration. Our goal is to keep the periods and the magnitudes of the peaks as close as possible to the original data.
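A low-pass filter of this kind might look like the sketch below. The source does not give the filter equation, so the first-order exponential form used here (each output moves a fraction α of the way toward the new sample) is an assumption, as is the function name `low_pass`.

```python
def low_pass(values, alpha=0.1):
    """First-order exponential smoothing (assumed form of the filter).

    s[0] = x[0]
    s[t] = s[t-1] + alpha * (x[t] - s[t-1])
    """
    if not values:
        return []
    smoothed = [values[0]]
    for x in values[1:]:
        previous = smoothed[-1]
        smoothed.append(previous + alpha * (x - previous))
    return smoothed


# Illustrative input: a step from 0 to 10 is followed only gradually.
step_response = low_pass([0.0, 10.0, 10.0], alpha=0.1)
```

With this form, a smaller α gives heavier smoothing and lower smoothed peaks, which is consistent with the remark that the smoothed values sit slightly below the actual acceleration.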

Peaks and Peak Magnitude Distribution
Once the data was smoothed, the peaks of the smoothed data were found by searching the neighboring values of each data point. A point $t$ is considered to have a maximum peak value $\hat{a}_t$ if the neighboring points have lower values than $\hat{a}_t$, that is, $\hat{a}_t > a_{t-1} > a_{t-2} > \cdots > a_{t-h}$ and $\hat{a}_t > a_{t+1} > a_{t+2} > \cdots > a_{t+h}$. The window length of the neighbors, $h$, represents the strength of the peak. Similarly, we obtain a minimum peak value $\breve{a}_t$ if the neighboring points have higher values than $\breve{a}_t$, that is, $\breve{a}_t < a_{t-1} < a_{t-2} < \cdots < a_{t-h}$ and $\breve{a}_t < a_{t+1} < a_{t+2} < \cdots < a_{t+h}$. Setting the minimum window length $h = 1$ identifies every maximum peak that is higher, and every minimum peak that is lower, than its immediate neighbors. The left panel of Figure 6 shows a typical peak finding on the same smoothed data as in Figure 5. The green circles are the identified maximum peaks and the red circles are the identified minimum peaks. Connecting only the maximum peaks and the minimum peaks yields the right panel of Figure 6.
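The window-based peak definition above translates fairly directly into code. The following is a minimal sketch under the stated definition (strictly monotone neighbors on both sides within a window of length h); the function names are hypothetical and the input values are illustrative.

```python
def _strictly_decreasing(seq):
    return all(a > b for a, b in zip(seq, seq[1:]))


def _strictly_increasing(seq):
    return all(a < b for a, b in zip(seq, seq[1:]))


def find_peaks(values, h=1):
    """Return (maxima, minima) as lists of indices, for window length h >= 1.

    Index t is a maximum peak if values fall strictly while moving away
    from t on both sides for h steps; a minimum peak if they rise strictly.
    """
    maxima, minima = [], []
    for t in range(h, len(values) - h):
        left = [values[t - k] for k in range(h + 1)]   # a_t, a_{t-1}, ..., a_{t-h}
        right = [values[t + k] for k in range(h + 1)]  # a_t, a_{t+1}, ..., a_{t+h}
        if _strictly_decreasing(left) and _strictly_decreasing(right):
            maxima.append(t)
        elif _strictly_increasing(left) and _strictly_increasing(right):
            minima.append(t)
    return maxima, minima


# Illustrative signal (not study data).
max_idx, min_idx = find_peaks([0, 2, 1, 3, 0, 1], h=1)
```

A larger h keeps only peaks that dominate a wider neighborhood, which is why the text describes h as the strength of the peak.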
Based on peak values, we also derive the distribution of the magnitude of acceleration peaks.The distribution is separated between maximum peak values and minimum peak values.An example of peak value distribution from Figure 6 is given in Figure 7.
The distribution tends to be skewed towards the left, and the values on the horizontal axes show that, as expected, maximum peaks have higher values than minimum peaks. These values represent the magnitude of acceleration during the peaks.

Deriving Distribution of Period Peaks
In addition to acceleration peak magnitude distribution, we must obtain acceleration peak period distribution.
Suppose a maximum peak of acceleration $\hat{a}_p$ happens at time $p$. A maximum period is defined as the difference between two immediate maximum peaks, that is, $T_p = \hat{a}_p - \hat{a}_{p-1}$. Similarly, suppose a minimum peak of acceleration $\breve{a}_p$ happens at time $p$. A minimum period is defined analogously as the difference between two immediate minimum peaks. The period set is the union of the maximum period set and the minimum period set. Having the periods of acceleration peaks, we can derive the distribution of the periods. Figure 8 shows the distribution of the periods of the peaks identified in the aforementioned figures.
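One way to compute the periods and their distribution is sketched below. Several points are assumptions rather than the authors' method: a period is taken as the time elapsed between consecutive peaks of the same kind, peak positions are sample indices converted to seconds at the approximate 30 Hz rate mentioned in the text, the "union" is implemented as a simple concatenation that keeps repeated values, and the histogram bin width is arbitrary. All function names are hypothetical.

```python
def peak_periods(peak_indices, sample_rate_hz=30.0):
    """Time in seconds between consecutive peaks of the same kind."""
    times = [i / sample_rate_hz for i in peak_indices]
    return [later - earlier for earlier, later in zip(times, times[1:])]


def period_set(max_indices, min_indices, sample_rate_hz=30.0):
    """Maximum periods followed by minimum periods (duplicates kept)."""
    return (peak_periods(max_indices, sample_rate_hz)
            + peak_periods(min_indices, sample_rate_hz))


def histogram(values, bin_width):
    """Count values per bin; bin b covers [b * bin_width, (b + 1) * bin_width)."""
    counts = {}
    for v in values:
        b = int(v // bin_width)
        counts[b] = counts.get(b, 0) + 1
    return counts


# Illustrative peak indices (not study data).
periods = period_set([0, 30, 90], [15, 45])
period_hist = histogram(periods, bin_width=0.5)
```

The same `histogram` helper could be applied to the peak magnitudes to obtain the maximum and minimum peak value distributions.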

Derived Features
A set of 12 summary statistics of the period histogram was used as the derived features to represent the histogram itself. Similar statistics were also used as derived features for the magnitude histograms, that is, the histogram of minimum peak values and the histogram of maximum peak values. Since we have three histograms and 12 derived features per histogram, each dataset is now represented by 36 values.
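The feature extraction for one histogram could be sketched as below. The feature names are taken from statistics mentioned elsewhere in the paper (mean, median, mode, t-statistic, minimum, maximum, range, standard deviation, skewness, kurtosis, the count N and the ratio of N to the number of acceleration samples); treating exactly these as the 12 features is an assumption. The t-statistic uses the usual form mean / (standard deviation / sqrt(N)), and skewness and kurtosis use population central moments; both choices are assumptions, as is the function name.

```python
import math
import statistics


def derived_features(values, n_acceleration_samples):
    """Twelve summary statistics of one set of values (assumed feature list).

    Requires at least two values that are not all identical.
    """
    n = len(values)
    mean = statistics.mean(values)
    sd = statistics.stdev(values)  # sample standard deviation
    # Population central moments for skewness and kurtosis (assumed form).
    m2 = sum((v - mean) ** 2 for v in values) / n
    m3 = sum((v - mean) ** 3 for v in values) / n
    m4 = sum((v - mean) ** 4 for v in values) / n
    return {
        "mean": mean,
        "median": statistics.median(values),
        "mode": statistics.mode(values),
        "t_statistic": mean / (sd / math.sqrt(n)),
        "minimum": min(values),
        "maximum": max(values),
        "range": max(values) - min(values),
        "std_dev": sd,
        "skewness": m3 / m2 ** 1.5,
        "kurtosis": m4 / m2 ** 2,
        "n": n,
        "ratio": n / n_acceleration_samples,
    }


# Illustrative values (not study data).
features = derived_features([1.0, 2.0, 2.0, 3.0], n_acceleration_samples=8)
```

Calling this once for the period set, once for the maximum peak values and once for the minimum peak values gives the 36 values per dataset described above.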

Feature Selection
The purpose of feature selection analysis is to find the derived features that can be clearly used to cluster subject-activities categories.
Table 2 lists the derived features in the columns; the last column is the subject-activity category. Each row represents one dataset of an experiment. The contents of this table stem from the derived features of the histograms, as explained in the previous section. Table 3 shows the average maximum and minimum peak values. The acceleration of sitting is very low. Walking has lower acceleration than climbing, and climbing has a lower acceleration value than running. An amputee, in general, has higher acceleration than able-bodied subjects for all activities except sitting. Similar results were obtained for the median of maximum and minimum peak values, as shown in Table 4. The median period of walking is higher for both able-bodied and amputee individuals (Table 5). Other activities are not distinguishable in terms of median periods between peaks.
The t-statistics of maximum and minimum peak values are presented in Table 6. A transtibial amputee has higher t-statistics than an able-bodied individual. Amongst activities, however, the values are not distinguishable. Table 7 presents the skewness of the period between peaks. It shows that the skewness of sitting is higher than that of walking, walking is higher than climbing, and climbing is higher than running. A transtibial amputee has a lower period skewness than an able-bodied person. Table 8 shows that the average period of peak acceleration has no distinguishable values among the categories. In fact, the average period of a transtibial amputee is almost equal regardless of the activities performed. Walking, in general, has a longer period between peaks than other activities.
As occurs with the average period of peak acceleration, no distinguishable values were obtained for any of the other derived features. The decision tree analysis (Figure 9), however, failed to classify amputee climbing and amputee running, probably due to lack of data (only one observation for each category).

Visualization of Features
Based on the three derived features identified by the decision tree, we visualize each observation as one point. The points are then colored based on the label category (see Figure 10). The visualization reveals possible derived features to overcome the weakness of the decision tree. Average minimum peak values higher than 8.510 m/s² and lower than 12.321 m/s² are related to class 7, which is the amputee-climbing activity, and average minimum peak values higher than 12.321 m/s² correspond to class 8, which is the amputee-running activity.
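The two thresholds read off the visualization can be expressed as a small rule. The sketch below applies them as an override on top of whatever class the original tree produced; where these splits actually sit inside the revised tree is not stated in the text, so this placement, the function name, and the handling of values exactly at a threshold (left to the tree) are assumptions.

```python
AMPUTEE_CLIMBING = 7
AMPUTEE_RUNNING = 8


def revise_class(avg_min_peak, tree_class):
    """Apply the two thresholds on the average minimum peak value (m/s^2).

    Falls back to the class predicted by the original tree otherwise.
    """
    if avg_min_peak > 12.321:
        return AMPUTEE_RUNNING
    if 8.510 < avg_min_peak < 12.321:
        return AMPUTEE_CLIMBING
    return tree_class


# Illustrative call: a dataset the tree labeled as class 3.
revised = revise_class(10.0, tree_class=3)
```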
We revised the decision tree in Figure 9 into the following figure to accommodate all categories.
Table 9 shows the confusion matrix as a percentage of total respondents (51 persons). The strong values on the diagonal of the confusion table show that the majority of subjects are correctly classified. The off-diagonal elements are the errors. The percentage correctly classified is 92.16%, obtained by summing the diagonal elements of Table 9. The majority of the error (6%) comes from misclassifying able-bodied-climbing as able-bodied-running. Another case of error (2%) is associated with misclassifying able-bodied-running as able-bodied-walking.
To elaborate on the error analysis, we further take each value as a percentage of the total predicted for its class. Table 10 shows that the error of misclassifying able-bodied-running as able-bodied-walking is only 5%, but the error of misclassifying able-bodied-climbing as able-bodied-running is actually 43%.
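The difference between the two views of the confusion matrix (percentage of all respondents versus percentage of each predicted class) can be illustrated with a short sketch. The counts below are hypothetical placeholders, not the study's Table 9: they were chosen only so that 51 observations with 3 and 1 misclassified cases reproduce the percentages quoted in the text. Rows are taken as actual classes and columns as predicted classes, which is also an assumption.

```python
def overall_accuracy(matrix):
    """Percentage of observations on the diagonal."""
    total = sum(sum(row) for row in matrix)
    correct = sum(matrix[i][i] for i in range(len(matrix)))
    return 100.0 * correct / total


def column_percentages(matrix):
    """Each cell as a percentage of its column (predicted-class) total."""
    n = len(matrix)
    col_totals = [sum(matrix[r][c] for r in range(n)) for c in range(n)]
    return [[100.0 * matrix[r][c] / col_totals[c] if col_totals[c] else 0.0
             for c in range(n)]
            for r in range(n)]


# Hypothetical diagonal counts for classes 1..8 (placeholders, not study data).
diagonal = [10, 19, 5, 4, 3, 4, 1, 1]
counts = [[diagonal[i] if i == j else 0 for j in range(8)] for i in range(8)]
counts[2][3] = 3  # actual able-bodied climbing (3) predicted as running (4)
counts[3][1] = 1  # actual able-bodied running (4) predicted as walking (2)

accuracy = overall_accuracy(counts)
by_column = column_percentages(counts)
```

With these placeholder counts, 3 errors are about 6% of 51 respondents but about 43% of the 7 datasets predicted as running, which is why the column-wise view makes the climbing-versus-running confusion look much more serious.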

Validation of Derived Features
To further validate the derived features, we ran an Analysis of Variance on the 8 categories. Results show that there are significant differences among the 8 categories regarding the following features: minimum peak values (mean, median, mode, t-statistic, minimum, maximum, range, standard deviation, skewness and kurtosis) at p = .001 and maximum peak values (mean, median and mode) at p = .001.
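For one derived feature, a one-way Analysis of Variance compares the variation between category means with the variation within categories. The sketch below computes only the F statistic in plain Python; the text does not state which ANOVA variant or software was used, so the one-way form is an assumption, the function name is hypothetical, and the groups are toy values rather than study data. Converting F to a p-value would additionally require the F distribution, which is not shown.

```python
def one_way_anova_f(groups):
    """F statistic of a one-way ANOVA over a list of groups of numbers."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    grand_mean = sum(sum(g) for g in groups) / n_total
    group_means = [sum(g) / len(g) for g in groups]
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of observations around their own group mean.
    ss_within = sum((x - m) ** 2
                    for g, m in zip(groups, group_means)
                    for x in g)
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Toy example with three categories of one feature.
f_value = one_way_anova_f([[1, 2, 3], [4, 5, 6], [7, 8, 9]])
```

In the study's setting there would be 8 groups, one per subject-activity category, and the test would be repeated for each derived feature.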

Conclusion
"A multidisciplinary approach is a definite benefit in the diagnosis and treatment of pathologic gait. An amputation is supposed to optimize the overall function of the patient. This is accomplished through the proper prescription of a prosthetic and appropriate rehabilitation training" (Kishner & Monroe 2013). Since most studies on prosthesis use focus on the design of the device rather than how it is used (Sawers, Hahn, Kelly, Czerniecki & Kartin 2012), visualization of gait data and identification of important parameters or features for gait classification can facilitate the rehabilitation process.
In this study, we show that, through visualization of the statistics of derived features, we were able to revise the decision tree that originally could not classify certain categories. It is worth noting that exhaustive plotting of pairs of derived features may not reveal which features can actually be used to classify subject-activity. Only after we use the decision tree and cross tabulation to select possible features can we use the visualization. The visualization may help to reveal clear clusters of missing categories based on two or three derived features.
We can also conclude that the period between peaks (which we suspect to represent the pace of activities) apparently does not yield any significant derived features to classify subject-activity. However, three individual features, namely the average of minimum peak values, the mode of maximum peak values and the t-statistic of minimum peak values, are sufficient for predicting subject-activity and distinguishing gait between able-bodied and transtibial amputee individuals.
In general, transtibial amputees have higher acceleration than able-bodied subjects for all activities except sitting. Transtibial amputees also have a lower skewness of the period between peaks of acceleration than able-bodied subjects. Such findings can further improve identification and classification of gait patterns between able-bodied persons and transtibial amputees.
Further studies include validation of wireless sensor applications with standard gait analysis tools.There is also a need to increase the sample size and conduct experiments on gait measurement with location of the sensor as an independent variable.

Figure 4: Image of smartphone attached to leg or prosthesis during the walking experiment.

Figure 5: Example of raw data and smoothed data.

Figure 8: Distribution of period of the peaks.
of T
4. N = number of data in set {T}
5. Ratio of the number of data in set {T} to the number of data in acceleration
6. Student t-statistic of period = average of T / (standard deviation of T / of T

Revista Colombiana de Estadística 37 (2014) 471-488

Figure 9: Results of decision tree analysis.

Table 1: Number of participants per activity.

Table 2: Features of Subject-Activity.

Table 3: Average Maximum and Minimum Peak Values.

Table 4: Median of Maximum and Minimum Peak Values.

Table 6: t statistics of the Maximum and Minimum peak values.

Table 7: Skewness of Period between Peaks.

Table 8: Number of participants per activity.

Table 10: Confusion Matrix (Percentage to total columns).