Functional Data Analysis for Biomechanics

The application of nonlinear tools and advanced statistical methods is becoming more prevalent in biomechanical analyses. In a traditional biomechanics laboratory with motion analysis equipment, large amounts of kinematic data can be collected relatively easily. However, a significant gap exists between all the data that are collected and the data that are actually analyzed. Because movements occur over a period of time, whether seconds or minutes, each movement is represented by a continuous series of kinematic data (e.g., 60 or 120 observations per second). Using standard analytic methods, the continuous data associated with each movement are often reduced to a single discrete number. This reduction to a single summary value, such as a peak flexion, extension, or range of motion, excludes potentially valuable information. Reducing a curve representing hip motion during gait to a single range of motion value, for example, precludes the analysis of the entire movement pattern or the timing of the movement. A handful of investigations have recognized this limitation and have begun using functions to maintain the shape and timing of the movement in the analysis. The primary purpose of this chapter is to introduce an emerging collection of statistical methods called Functional Data Analysis (FDA). FDA is distinct from traditional analytic methods because how data changes continuously over time can be assessed. Therefore, information in continuous signals can be retained, such as changes in joint angles or in landmark positions during a movement task. FDA can be used for both exploratory and hypothesis driven analyses with traditional multivariate statistical methods that have been modified for functional predictor and response variables. Although representing motion data as a set of functions is not new to biomechanics analyses (Chester & Wrigley, 2008; Deluzio & Astephen, 2007; Landry et al., 2007; Lee et al., 2009; Sadeghi et al., 2002; 2000), statistical methods developed specifically for analyzing these functions have not been available. More recently, FDAmethods have been usedwithin biomechanics to studymastication (Crane et al., 2010), back pain (Page et al., 2006), as well as age, gender, and speed effects on walking (Roislien et al., 2009). Given the interest in and need for treating motion data as functions, it is important that methods for analyzing a set a functions using emerging statistical methods are brought to the attention of those in the biomechanics community. Although several excellent references exist for Functional Data Analysis (Ramsay, 2000; Ramsay et al., 2009; Ramsay & Silverman, 2002; 2005) there are important issues for biomechanists to be aware of when implementing this set of statistical tools. Therefore, the aims of this chapter are to provide an overview of the steps associated with FDA, to focus on 4


Introduction
The application of nonlinear tools and advanced statistical methods is becoming more prevalent in biomechanical analyses.In a traditional biomechanics laboratory with motion analysis equipment, large amounts of kinematic data can be collected relatively easily.However, a significant gap exists between all the data that are collected and the data that are actually analyzed.Because movements occur over a period of time, whether seconds or minutes, each movement is represented by a continuous series of kinematic data (e.g., 60 or 120 observations per second).Using standard analytic methods, the continuous data associated with each movement are often reduced to a single discrete number.This reduction to a single summary value, such as a peak flexion, extension, or range of motion, excludes potentially valuable information.Reducing a curve representing hip motion during gait to a single range of motion value, for example, precludes the analysis of the entire movement pattern or the timing of the movement.A handful of investigations have recognized this limitation and have begun using functions to maintain the shape and timing of the movement in the analysis.The primary purpose of this chapter is to introduce an emerging collection of statistical methods called Functional Data Analysis (FDA).FDA is distinct from traditional analytic methods because how data changes continuously over time can be assessed.Therefore, information in continuous signals can be retained, such as changes in joint angles or in landmark positions during a movement task.FDA can be used for both exploratory and hypothesis driven analyses with traditional multivariate statistical methods that have been modified for functional predictor and response variables.Although representing motion data as a set of functions is not new to biomechanics analyses (Chester & Wrigley, 2008;Deluzio & Astephen, 2007;Landry et al., 2007;Lee et al., 2009;Sadeghi et al., 2002;2000), statistical methods developed specifically for analyzing these functions have not been available.More recently, FDA methods have been used within biomechanics to study mastication (Crane et al., 2010), back pain (Page et al., 2006), as well as age, gender, and speed effects on walking (Røislien et al., 2009).Given the interest in and need for treating motion data as functions, it is important that methods for analyzing a set a functions using emerging statistical methods are brought to the attention of those in the biomechanics community.Although several excellent references exist for Functional Data Analysis (Ramsay, 2000;Ramsay et al., 2009;Ramsay & Silverman, 2002;2005) there are important issues for biomechanists to be aware of when implementing this set of statistical tools.Therefore, the aims of this chapter are to provide an overview of the steps associated with FDA, to focus on issues related to aligning curves from cyclical data, to discuss specific statistical techniques, and finally to provide an example of how to implement the techniques introduced in the chapter.

Overview of basic FDA procedures
There are four primary procedures in FDA: transforming data into functions, smoothing functions, registering functions, and analysis.The purpose of this section is to provide a conceptual introduction to each of these procedures.Open source tools are available for performing the procedures described in this chapter using R statistical software (R Development Core Team, 2010) along with the freely available fda library (Ramsay et al., 2010).

Transforming data into functions
The first step in FDA is transforming the captured time series data into functions.For cyclical behaviors such as locomotion or mastication, it is common to begin by dividing a long sequence of motion data captured during walking or chewing into individual cycles (Fig. 1).
Fig. 1.Displacement data representing a time series of chewing kinematics captured using an optoelectronic motion capture system.
Each cycle is then transformed into a function.Therefore, if one cycle was represented by n observations, by transforming those data points into a function, the behavior can now be considered a single functional observation (Fig. 2).In Functional Data Analysis, functions are represented as a linear combination of a set of basis functions (see Equation 1).
One of the first decisions that must be made is the type of basis system to use to represent the data.Typical selections are Fourier and spline basis systems, but others such as polynomials or wavelets can also be used.A Fourier basis system is typically used for periodic data, while splines (in particular b-splines) are used for non-periodic data.For K basis functions, there will be K corresponding coefficients.As the number of basis functions used to model the behavior increases, complexity also increases.While increasing the number of basis functions can produce an excellent fit to the data, the complexity of the function increases computational time for certain procedures and can increase the analytical complexity and interpretation.An important advantage of transforming discrete data into functions is the ease with which derivatives can be computed and assessed, a common task in biomechanical analyses (Fig. 3).Two points should be given consideration when transforming data into functions if derivatives are going to be used.First, end point error can be amplified in derivatives (see  Section 2.2 for one possible solution to this issue).Second, wild fluctuations in the data can be amplified as a result of the basis system selected.
Fig. 3. Functions representing displacement, velocity, and acceleration for a single chewing cycle.

Smoothing
Functions can be smoothed by minimizing the number of basis functions and using regression analysis or by using a roughness penalty approach.Regression splines use a least squares 79 Functional Data Analysis for Biomechanics estimation process.The advantage of regression splines is their simplicity.However, the use of a large number K of basis functions relative to the number of data points in the curve tends to overfit the data.Conversely, using a small number of regression splines risks losing localized functional features.Further, regression splines can produce end-point error in derivative estimates.Thus, regression splines can work well when simplicity is important, but have limitations for more complex cases.An alternative to using regression splines to smooth the data is an approach that smoothes the data using a roughness penalty.With this approach it is common to have one basis function for each observation in the curve.Because this curve may contain noise, which can be especially amplified in derivatives, a measure of function complexity is penalized to impose smoothness.Unlike regression splines, the roughness penalty approach uses a large number of basis functions without overfitting or sigularity problems.Figure 4 illustrates the same curve fit with each type of smoothing.
Fig. 4. The left plot overlays raw data (dashed line) from a chewing cycle with a function (solid line) fit to the data using a regression spline.The right plot overlays the same raw data (dashed line) with a function (solid line) fit to the data using a roughness penalty approach.
For further discussion and computational details related to smoothing see Ramsay et al. (2009).Briefly, the roughness of a curve f is typically measured as the integrated squared second derivative (Equation 2).
In Equation 2, [D 2 f (t)] 2 is a measure of roughness of function f at value time t.The idea is that when a function contains noise, the noise is amplified in the second derivative.Therefore, the square of the second derivative will be large where noise is present in the curve.To create the smooth curve, a multiple of the roughness penalty is applied to the error sum of squares and a smoothing parameter λ is used to specify the degree of penalty.As λ approaches 0, the fit of the function to the observations improves.
Whether data are filtered using more traditional methods, such as digital filters, or are smoothed using regression splines or a roughness penalty approach, it is important to note that each of these smoothing techniques has the potential to impact the location of specific and 80 Theoretical Biomechanics www.intechopen.combiologically relevant points in the curve (Fig. 5).While this may seem obvious, it is important to understand the possible sources of noise introduced by the measurement technique as well as how the data smoothing method modified the curve to reduce the noise.We note that different techniques can produce different results that may affect interpretation of the behavior and analysis.This is illustrated below with a chewing cycle.Within a cycle, there are four biologically relevant phases and, therefore, three transition points between phases.The four phases, in order shown in

Registering functions -Curve alignment
An important curve that many biomechanists generate is a mean curve to represent a specific behavior.It is well known that when a curve contains both phase and amplitude variation, they can not be easily compared and an average curve does not accurately represent the true behavior (Fig. 6).Therefore, the purpose of curve alignment is to reduce phase variability while preserving the curves shape and amplitude.A common method for aligning curves used in biomechanics applications is a linear time normalization procedure.An analogous procedure exists in Function Data Analysis called registration.Curves can be registered using a continuous method or by using landmarks.Both approaches use time warping functions, which align curves to minimize phase variation.Two curves are perfectly aligned if they differ only in amplitude.If f 1 and f 2 are perfectly aligned, then the plot of f 1 (t) versus f 2 (t) will yield a straight line.Continuous registration uses an iterative numerical approach to choose time warping functions that maximize the proportionality of all the functional observations.A landmark is defined as a point that is identifiable in every curve.This may be a minima, maxima, or zero crossing.Landmark registration aligns all specified landmarks by transforming time for each curve so that the landmarks occur in the same location.A common procedure for landmark registration is to first identify the location of the landmark of interest for each individual curve.Next, an average of these landmarks is calculated.Finally, time is transformed for each curve such that the identified landmark occurs at the average location.Although landmark registration is computationally less intensive than continuous registration, there is an upfront data processing cost to identifying the location of the landmark(s) for each individual curve.

Statistical tests
An important advantage of treating the behavior as a function rather than n discrete data points is that the data do not have to be reduced to a single number to be analyzed.For example, it is common to identify peak flexion or extension or range of motion for analysis using traditional multivariate statistical methods.However, reducing the data into one or two descriptive statistics, such as an average, maxima, or minima, eliminates a large amount of valuable information that cannot be summarized with one number.In traditional multivariate statistics, high correlation among predictor variables causes unstable results.Because the points of a curve in a time series are highly correlated, putting all of the points representing the curve into a multivariate analysis could result in severe collinearity.By using FDA and representing each curve as a function, it is possible to use a functional analogue of traditional methods without the problem of collinearity.An example is provided in Section 4.

What makes a curve suitable for registration?
One advantage of FDA is that position variables, as well as their time derivatives, are all possibilities that can be considered for use in registration.However, there are characteristics associated with curves that improve registration.The presence or absence of these characteristics can suggest whether position or derivative variables should be used.The main consideration is that curves with characteristic features such as maxima, minima, or inflection points result in a set of registered curves with less phase variability compared to registering a set of curves without one or more of these features, particularly during continuous registration.
The issue of having a peak or inflection point that can be used for registration purposes is an important one.Take, for example, position data representing a chewing cycle.The chewing cycle can be defined two different ways, from maximum open to maximum open or maximum close to maximum close (Fig. 8).Defining the cycle from maximum close to maximum close produces a curve without characteristic features needed for registration.It is important to consider that position data may not always have characteristic features that are needed for registration, yet using position data is sometimes important for a particular analysis.The absence of a characteristic feature can be ameliorated in cyclical position data by two possible methods.First, one can consider dividing cycles using a different start and end event that may adjust the cycle to include a clearly defined maxima, minima, or zero crossing within the cycle.Second, a derivate of the position curves can be estimated and these derivative curves can be aligned.Once the derivative curves are aligned, the resulting time transformation information associated with each curve can be used to register the position curves.To illustrate the difference the selection of start and end events can make in the shape of the curve, Figure 8 is an example of a cycle that begins and ends with the teeth together (i.e., maximum close).Maximum opening represents the characteristics feature that can be used for registration.Therefore, a clearly defined minima is present in the displacement data.This curve can be compared to a chewing cycle that begins and ends with maximum open (see Fig. 2) Fig. 8. Example chewing cycle define from maximum close to maximum close.

Is time normalization necessary?
As previously discussed, when curve alignment is required prior to an analysis, there are three registration methods from which to select: linear time normalization, continuous registration, and landmark registration.However, an issue occurs when working with cyclical data that require time normalization prior to continuous or landmark registration.That is, a prerequisite of the FDA registration procedures is that all curves have the same time interval.Three methods can be considered for achieving this prerequisite when cycles have variable durations.
The first method is a procedure used by Page et al. (2006) for creating a common time interval across all cycles.Their procedure recommends completing the raw data with a suitable number of pairs so that the trial with the longest duration becomes the duration for all trials.While this will work for single movements, such as the sit-to-stand movement presented in the Page et al. (2006) analysis, it does not work for cyclical data.When cycles are padded at the end to create a common time interval, artifacts are created in the data that create two problems precluding the generalization of results.First, artifacts in cyclical data change the interpretation of the actual behavior.For example, when individual chewing cycles are extended so that each cycle has the same duration, chewing is characterized as having an 84 Theoretical Biomechanics www.intechopen.com unusually long phase at the end of the cycle where the teeth are together (i.e., closed gape).Figure 9 illustrates a raw, unaltered cycle, the corresponding time-normalized cycle, and the cycle that has been padded at the end using Page et al. (2006) method.The problem that occurs is that the phase at the end is artificial, since it has only been added to extend the duration of the cycle.The consequence to this solution is that statistical analyses, such as functional principal component analysis or functional analysis of variance, will provide inaccurate results because data were inappropriately added to cycles, thereby changing the behavioral pattern of the movement.Therefore, the outcome of these types of typical FDA analyses will provide inaccurate results.Thus, although completing raw data may be suitable for discrete tasks, the interpretation of the behavior will be affected and incorrect if this method is applied to cyclical data.As an alternative, padding cycles by evenly distributing missing values could be a potential solution.However, Figure 10 illustrates that the problem remains with the missing value method.Because a common time interval is required for FDA analyses, it is important to assess whether registration procedures are necessary after standard time normalization has been applied to the data.Crane et al. (2010) found that registration after time normalization may not always be necessary.They concluded that when obtaining an average curve was the purpose of registration, the additional alignment did not affect the interpretation of the behavior 85 Functional Data Analysis for Biomechanics www.intechopen.comenough to warrant the additional data manipulation.So when might registration be useful?When specific functional data analyses, such as those discussed in Section 4, will be used, Crane et al. (2010) found that decreases in inter-subject variability within homogenous groups indicate registration may be justified.Because of this, when functional data analyses are used, registration improves the likelihood of finding significant differences between two or more populations, if they indeed exist.

Specific functional data analyses
This section discusses how the linear model, one of the classic methods of data analysis, can be extended to functional biomechanical data.We illustrate the ideas in this section using a subset of data collected as part of a study by Gross et al. (In Press) that contains knee angles during a gait cycle and height normalized walking speed for 60 trials.

Classical linear model
In the classical linear model, a response variable Y is modeled as a linear combination of predictor variables X and a random distrubance term ǫ.
We can think of β j as the effect of X j on the expected value of Y. Common goals of a linear model include 1. Estimating the β j 2. Creating confidence intervals for β j 3. Testing the hypothesis H 0 : β j = 0

Functional linear model example
Figure 11 displays the knee angle throughout a gait cycle for one of the 60 trials.Time is normalized so that the start and end of the gait cycle are 0 and 1, respectively.Notice that peak flexion occurs at time ≈ 0.7.Suppose we wish to investigate whether peak flexion (denoted X) can predict a scalar outcome such as walking speed (normalized velocity) Y.The method of least-squares can be used to fit a model of the form Ŷ = β0 + β1 X, where • i = 1, ... , 60 indexes the trial • t j (j = 1, ... ,100) indexes the times during the cycle when gait is measured • f i (t j ) denotes the knee angle for trial i at time t j • β j is the effect of knee angle at time t j on Y This multiple regression model uses each angle measurement rather than just the angle at peak flexion.However, there are serious problems with this proposed model.
1.With 101 parameters (including the intercept) and only 60 trials, there is not enough data to reliably estimate a model of this complexity.
2. There would likely be a strong correlation between knee angles at nearby time points, causing instability in estimated effects.
The functional nature of knee angle is apparent in Figure 11.By considering the data as a sample of observations from a smooth underlying curve, we will be able to use all of the information without resorting to the problematic multiple regression model.Before fitting a 87 Functional Data Analysis for Biomechanics functional linear model, we must first fit a smooth curve to the data in Figure 11.We do this using b-spline basis functions.To avoid overfitting, a roughness penalty is imposed on the curvature of the fitted function.The smooth fits for 3 trials are displayed in Figure 12.
In the functional linear model, β(t) is a coefficient function whose values capture the effect of knee angle at time t on walking speed.The details of how the functional linear model is estimated are beyond the scope of this text (see Ramsay et al. (2009) for more details).The results, however, describe the power of the functional linear model.Figure 13 shows how knee angle affects Y at various points in the gait cycle.Recall that the results from the ordinary linear regression of Y versus maximum angle showed no relationship.This is consistent with Figure 13

Other functional data analyses
This section illustrates how the functional linear model reveals a more complete picture of a relationship than analysis of features (such as maxima).Our example involves a scalar outcome, but methodology exists for functional outcomes as well.Various other classic statistical methods also have functional analogues, including Analysis of Variance, Canonical Correlations, and Principal Components.

Implementation
The following code can be implemented in R (R Development Core Team, 2010) , an open source statistical software package.Once R is installed, the fda library (Ramsay et al., 2010) as well as the CraneLib will need to be installed.The fda library contains the functions that perform the procedures discussed in this chapter.The CraneLib contains the example data needed to run the following example.

Conclusions
Functional Data Analysis is an important analytical method that can be used for exploratory and hypothesis driven analyses.A primary advantage to FDA is the ability to assess continuous data that change over time without having to reduce the signal into discrete variables.However, there are some important issues for biomechanists to be aware of when implementing this set of statistical tools.Therefore, this chapter provided an overview of the steps associated with FDA, focused on issues related to aligning curves from cyclical data, discussed specific statistical analytical techniques, and finally provided an example using sample code in R.

Fig. 2 .
Fig. 2. A single chewing cycle.Dots indicate the recorded observations while the line represents the function that was fit to the observed data.
Fig. 5. From left to right are the FC-SC, maximum close, and SO-FO transitions.Dashed lines represent the location of transitions identified after regression spline smoothing and solid lines represent the location of transitions identified after roughness penalty smoothing.
The algorithm begins by aligning each individual curve in the set of curves to the group average curve.Once the new set of aligned curves is generated, a new average curve is 81 Functional Data Analysis for Biomechanics www.intechopen.comcalculated and the set of aligned curves are aligned again to the new average curve.See Ramsay & Silverman (2005) for further details.Landmark registration is in many ways a simpler form of registering a set of curves.

Fig. 6 .
Fig. 6.A set of five curves illustrating the effect of curve alignment.Without alignment the average curve does not provide an accurate representation of the behavior.Whether continuous registration or landmark registration is used, the outcomes include a new set of aligned curves as well as time warping functions.The time warping functions contain information about how phase was adjusted.A unique time warping function is associated with each individual curve.Figure7shows the warping functions associated with each of the registered curves.

Fig. 7 .
Fig. 7.This figure shows the warping functions associated with each of the registered curves.

Fig. 9 .
Fig. 9. Illustration of the outcome of creating a common time interval by completing the raw data with a suitable number of pairs.The new curve does not provide an accurate representation of the behavior.

Fig. 10 .
Fig. 10.Illustration of the outcome of creating a common time interval by padding the raw cycles by evenly distributing missing values.The new curve does not provide an accurate representation of the behavior.

Fig. 12 .
Fig. 12. Functions fit to three trials.After estimating knee angle versus time functions for all 60 trials, we can replace the discrete angle measurements f (t 1 ), ..., f (t 100 ) with a continuous angle function f (t).With this continuous function, we can write the form of the functional linear model: which shows β(t) close to 0 near the times of peak flexion (typically occurring between 0.71 and 0.73).Both the functional model and the ordinary linear model show no substantial relationship between Y and knee angle in this range of the gait cycle.The functional linear model does indicate that knee angle between .31 and .46,.55 and .64,as well as .85and .95seem to have an association with Y.When the confidence bands for the coefficient function are entirely above 0, this indicates that individuals with greater extension at these stages tend to also have an increased gait velocity Y.When the confidence bands for the coefficient are entirely below 0, this indicates that individuals with greater flexion at these stages tend to have a decreased velocity.88 Theoretical Biomechanics www.intechopen.com

Fig. 13 .
Fig. 13.Regression coefficient associated with the functional linear model.Dashed lines represent the 95% confidence interval.When this interval does not include zero there is a significant relationship between knee angle and Y.

Table 1 .
is the intercept, and 4. β1 is the estimated effect of angle on YThe results of the fitted linear model on the n = 60 trials are presented in Table1.Note that there is no significant linear relationship between peak knee flexion and Y (p = 0.73) The analysis above, however, only selects a single feature of the gait cycle (peak flexion) as a predictor of Y.An entire gait cycle contains much more information than just the maximum, and perhaps angles other than the absolute maximum can predict Y.With 100 time-normalized angle measurements on each of 60 trials, we could make use of all the information in a multiple regression by fitting a model of the form Fig. 11.Example of knee angle data from a gait cycle.Y versus peak knee flexion