Although commercial, research-grade, motion capture systems are frequently utilized in assessments of human movement due to their accuracy and reliability, these systems are expensive and nonportable (Best & Begg, 2006). Recently, repurposed gaming peripherals have gained considerable interest as cost-efficient, portable alternatives for capturing human movement. Numerous studies have examined the ability of the Microsoft Kinect™, a markerless depth camera capable of tracking full-body three-dimensional kinematics, to assess posture and gait.

Postural research has demonstrated that the Kinect accurately captures linear and angular displacements of body landmarks during stance, reaching, weight-shifting, and other functional movements in both healthy and pathological populations (Bonnechere et al., 2014; Clark et al., 2012; Galna et al., 2014; Obdrzalek et al., 2012). Additionally, the Kinect accurately measures coronal plane trunk angle (Clark et al., 2013) and upper and lower extremity range of motion (Bonnechere et al., 2014; Fernández-Baena et al., 2012). Interestingly, the Kinect is well-suited to capturing temporal aspects of both gross and fine movements, but it exhibits reduced spatial accuracy for fine movements (e.g., toe tapping or hand clasping) relative to research-grade technology (Galna et al., 2014).

Studies examining locomotion demonstrate that the Kinect captures spatiotemporal gait variables such as stride time and length, gait speed, and foot swing velocity (Clark et al., 2013; Pfister et al., 2014; Xu et al., 2015), joint angles (Pfister et al., 2014; Xu et al., 2015), and phases of the gait cycle (e.g., stance, swing, and dual support time; Xu et al., 2015) despite discrepancies identifying gait events and peak angular displacement. In summary, the Kinect has been validated for spatial and temporal variables such as the displacement and velocity of single segments and the absolute or relative orientation of body segments. Importantly, this work has defined movement contexts in which repurposed gaming technology provides comparable performance to research-grade technology.

To date, the ability of the Kinect to examine interlimb coordination, an important theoretical and clinical aspect of movement control, has not been explored. Interlimb coordination describes the spatiotemporal patterning of multiple body segments, often incorporating position and velocity information. The original studies investigating human interlimb coordination provided fundamental insights into the control of human movement utilizing collective variables, such as relative phase, to describe movement dynamics (Kelso, 1984; Kelso et al., 1986). Collective variables provide concise, low-dimensional descriptions of the macroscopic behavior of a complex system. For example, relative phase aptly characterizes the spatiotemporal coordination of two oscillating limbs. Studies examining interlimb coordination from a dynamical systems perspective have been instrumental to understanding movement control processes (Kelso, 1995). This perspective places strong emphasis on the connection between movement patterns, variability, health, and functional status (van Emmerik & van Wegen, 2002). Aspects of interlimb coordination have been utilized to discriminate various at-risk or pathological populations such as children with developmental disorders (Isenhower et al., 2012; Volman et al., 2006), athletes at risk for overuse injuries (Hamill et al., 1999), and older adults with altered motor capacities (Serrien et al., 2000). Measures of coordination are sensitive to subtle alterations to movement, in comparison to absolute or relative angular kinematics (Haddad et al., 2010), yet these measures are infrequently used in clinical domains, perhaps due to the need for expensive and complicated equipment. The Kinect system has the potential to make assessments of coordination more obtainable outside of laboratory environments given its portability and markerless capabilities.

Previous research demonstrates that the Kinect inaccurately measures movement amplitude for certain body landmarks (Bonnechere et al., 2014; Clark et al., 2012; Pfister et al., 2014); however, the structural and temporal aspects of movement appear to be preserved (Bonnechere et al., 2014; Fernández-Baena et al., 2012; Galna et al., 2014). Measures of coordination are frequently assessed in phase space using normalized position and velocity profile information, which would reduce the influence of movement amplitude errors. As such, the Kinect may be capable of accurately measuring interlimb coordination. However, interlimb coordination could prove difficult for the Kinect as moment-to-moment fluctuations in the timing or spatial characteristics of movement from a single limb can lead to discrepancies in the measured pattern of coordination.

Given the importance of interlimb coordination in the theoretical and clinical domains of movement science, our aim was to assess the efficacy of the Microsoft Kinect for capturing human bimanual coordination by adapting a classical bimanual paradigm (Kelso et al., 1986). If the Kinect is capable of assessing coordination measures such as relative phase, these measures could become more widely adopted in clinical settings as well as other naturalistic environments outside of the laboratory.

Method

Participants

Twenty-four college-aged individuals (15 females, nine males) free of neurological disorders participated in the study for course extra-credit. The study was approved by the university institutional review board, and all participants provided written informed consent.

Protocol

The study assessed the concurrent validity of oscillatory hand movements captured by the Kinect and a research-grade, marker-based motion capture system (Vicon Motion Systems Ltd., Oxford, UK). Participants performed two patterns of coordinated hand movements, in-phase (0°; simultaneous wrist flexion–extension) and antiphase (180°; opposing wrist flexion–extension), five times each (ten total trials). Condition order was randomized. Participants performed hand flexion–extension in synchrony with a metronome and were instructed to complete a full cycle of movement per interbeep interval. On each trial the interbeep interval progressively decreased by 0.1 s every 10 s. Each trial lasted 80 s and consisted of eight 10-s temporal epochs. Interbeep intervals ranged from 1.0 to 0.3 s, corresponding to movement frequencies of 1.00 to 3.33 Hz. Movement frequencies were selected on the basis of values utilized in previous work (Kelso, 1984; Kelso et al., 1986).
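
The metronome schedule described above can be laid out explicitly. The following is a minimal sketch in Python, not the authors' code; the function name and its parameters are hypothetical, and the values are taken from the protocol (1.0 to 0.3 s in 0.1-s steps, 10 s per epoch).

```python
def metronome_schedule(start_s=1.0, stop_s=0.3, step_s=0.1, epoch_s=10.0):
    """Return (epoch onset in s, interbeep interval in s, frequency in Hz) per epoch.

    Hypothetical helper: the defaults mirror the protocol described in the text.
    """
    n_epochs = round((start_s - stop_s) / step_s) + 1
    schedule = []
    for k in range(n_epochs):
        # Round to suppress floating-point drift in the repeated subtraction.
        interval = round(start_s - k * step_s, 10)
        schedule.append((k * epoch_s, interval, 1.0 / interval))
    return schedule


for onset, interval, freq in metronome_schedule():
    print(f"onset {onset:4.0f} s  interval {interval:.1f} s  frequency {freq:.2f} Hz")
```

Under these assumptions the eight epochs correspond to prescribed frequencies of 1.00, 1.11, 1.25, 1.43, 1.67, 2.00, 2.50, and 3.33 Hz.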

Task instructions

Auditory cuing and task instructions are known to alter coordination dynamics, attracting movement frequency to an external pace (Byblow et al., 1994; Fink et al., 2000) and increasing coordination stability through volitional control (Lee et al., 1996), respectively. External pacing and task instructions to maintain the prescribed pattern were given in order to determine the efficacy of the Kinect for measuring both coordination patterns over a range of movement frequencies, reducing the likelihood of phase transitions and the need for additional repetitions to attain the antiphase pattern at higher movement frequencies.

Experimental apparatus

Previous research investigating the Kinect has identified a variety of user and environmental concerns that affect the quality of its skeletal data, such as ambient lighting, reflective or absorptive surfaces, color and texture variability of clothing, edge detection difficulties, device orientation, and environmental clutter (Dutta, 2012; Macknojia et al., 2012; Obdrzalek et al., 2012; Shotton et al., 2013). The present study was therefore conducted with no ambient lighting and the experimental apparatus and backdrop were covered in white sheets. Participants were required to wear short sleeves and remove any accessories from their wrists or hands. The apparatus consisted of a seat and two armrests (Fig. 1). Participants performed all conditions while seated with their hands and wrists extended beyond the front edge of the armrests. The Kinect was placed directly in front of participants at a horizontal distance and height of 1.4 m. The Kinect contains a tri-axial accelerometer that was used to align the device’s coordinate system with the gravitational line. Three Vicon cameras were placed on either side of the Kinect (Fig. 1).

Fig. 1

Bird's-eye view of the experimental setup. Participants sat in a chair located 1.4 m in front of the Kinect with forearms rested on armrests, with wrists and hands beyond the front edge of the armrests to allow freedom of motion. Participants were instructed to keep their forearms in contact with the armrests throughout the trial and to restrict movement to the wrist joint. Three Vicon cameras were placed on either side of the Kinect

Data collection and analysis

Kinematic data were simultaneously captured from the Vicon and Kinect systems. The Vicon data were sampled at 120 Hz from two 10-mm retroreflective markers placed bilaterally on the 3rd metacarpophalangeal joint, to match the skeletal landmarks of the Kinect. Synchronized audio was collected to determine the timing of the metronome beeps. Kinematic data from the Kinect were collected using a freely available application that allows access to various skeletal tracking features and output streams (Dolatabadi et al., 2013). Three-dimensional kinematics were collected utilizing the near depth range and seated kinematic model consisting of the hands, wrists, elbows, shoulders, neck, and head (Microsoft, 2013). The Kinect sampled at its native rate of approximately 30 Hz. To compensate for the low and irregular sampling rate, Kinect position data were resampled to 120 Hz using piecewise-cubic Hermite interpolation, ensuring the continuity of the first derivative. Position data from both devices were filtered using a 4th-order, low-pass Butterworth filter at 8 Hz. After filtering, velocity was estimated using a centered, five-point, finite difference method, yielding errors on the order of \( h^4 \), where \( h \) is the sampling interval. Data from the two systems were time-synchronized by cross-correlating the position time series during a gross motor movement (raising the arms) performed at the beginning of each trial. All data reduction and processing were performed in MATLAB (The MathWorks Inc., Natick, MA, USA).
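
To illustrate the velocity estimate, the sketch below implements the standard centered five-point first-derivative stencil in Python. It is an illustration under stated assumptions and not the MATLAB processing used in the study: the function name is hypothetical, the resampling and filtering steps are not reproduced, and the handling of the first and last two samples (left as NaN) is an assumption because the text does not describe it.

```python
import math


def five_point_velocity(position, fs):
    """Centered five-point estimate of the first derivative.

    position : list of uniformly sampled positions
    fs       : sampling rate in Hz (e.g., 120 after resampling)
    The truncation error of this stencil is on the order of h**4, h = 1/fs.
    Edge samples are returned as NaN (an assumption, not from the paper).
    """
    h = 1.0 / fs
    velocity = [float("nan")] * len(position)
    for i in range(2, len(position) - 2):
        velocity[i] = (
            position[i - 2] - 8.0 * position[i - 1]
            + 8.0 * position[i + 1] - position[i + 2]
        ) / (12.0 * h)
    return velocity


# Usage: a 2-Hz sinusoid sampled at 120 Hz; the derivative should track the cosine.
fs = 120.0
x = [math.sin(2.0 * math.pi * 2.0 * i / fs) for i in range(240)]
v = five_point_velocity(x, fs)
```

Because the stencil uses two samples on each side, it is only defined for interior points; any other edge treatment would need to be chosen separately.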

Movement frequency and amplitude were assessed for each cycle of wrist flexion–extension. Means and standard deviations were computed separately for each device. To determine movement periods, events corresponding to maximum wrist flexion or extension were chosen to denote the beginning and end of movement cycles. The event location varied between participants on the basis of whether the minima or maxima of each hand movement were synchronized with the metronome beep. For example, a participant timing to maximum flexion demonstrates reduced variability at the minima relative to the maxima of their movements (Fig. 2). As such, the more salient of the two events (i.e., minima or maxima) was chosen independently on each trial but was consistent across devices, hands, and epochs, to reduce event detection errors. Movement periods were calculated as the temporal difference between event n + 1 and n, corresponding to a full cycle of movement. The movement frequency was measured as the inverse of the movement period. Movement amplitude was calculated as the difference between the maximum and minimum vertical position of the hand.
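
The per-cycle measures described above reduce to a few lines of arithmetic. The following Python sketch assumes that event times (in seconds) have already been identified and that the position series starts at time zero with a uniform sampling rate; the function name and the returned dictionary keys are hypothetical.

```python
def cycle_metrics(event_times, position, fs):
    """Per-cycle period (s), frequency (Hz), and amplitude (position units).

    event_times : times of successive events (e.g., maximum flexion), in s
    position    : vertical hand position sampled at fs Hz, starting at t = 0
    """
    cycles = []
    for t0, t1 in zip(event_times[:-1], event_times[1:]):
        period = t1 - t0                      # event n + 1 minus event n
        i0, i1 = round(t0 * fs), round(t1 * fs)
        segment = position[i0:i1 + 1]         # samples spanning one cycle
        cycles.append({
            "period": period,
            "frequency": 1.0 / period,        # inverse of the movement period
            "amplitude": max(segment) - min(segment),
        })
    return cycles
```

Means and standard deviations across cycles can then be computed separately for each device, as in the text.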

Fig. 2

Hand position time series from a single subject for the Vicon (top panels) and Kinect (bottom panels) at 1.00 and 3.33 Hz. At low movement frequencies some participants did not perform continuous movements, leading to differences in the shapes of the position trajectories at the minima and maxima. This appeared to have a much greater influence on the Kinect time series. Events (red circles) were determined independently for each trial by inspecting the Vicon time series

Continuous relative phase (CRP) and the standard deviation of CRP were calculated separately for each device (Kelso, 1995). First, the time series was segmented into eight epochs, 10 s in duration. Movement cycles were identified using the events described above. The first and last cycle of each epoch were removed to ensure steady state behavior. The position and velocity profiles of each hand were then time normalized to 100 data points, each point representing 1 % of the movement cycle. Phase plane normalization procedures are commonly employed in studies examining interlimb coordination to adjust for amplitude differences in segment range of motion and determine how data is distributed around the origin (Hamill et al., 2000). Phase planes are commonly normalized to a unit circle using the maximum and minimum values (van Emmerik & Wagenaar, 1996) or the maximum absolute value, which preserves information regarding zero velocity (Hamill et al., 2000). Normalization may be applied on a cycle-by-cycle basis or over multiple cycles. Phase portraits were constructed by normalizing the vertical position and velocity profiles of the hand movements to a unit circle on each cycle (Fig. 3). Phase angles for each hand were then calculated using Eq. 1 (Hamill et al., 2000) where the phase angle at time \( t \) in the cycle, \( \varphi(t) \), was calculated by the arctangent of the normalized velocity, \( v(t) \), divided by the normalized position, \( s(t) \). CRP was calculated across all points in a cycle according to Eq. 2 (Hamill et al., 2000) by subtracting the phase angle of the left hand, \( \varphi_L(t) \), from that of the right hand, \( \varphi_R(t) \), at time \( t \) in the cycle. CRP is a circular variable, and thus mean and standard deviation were calculated using circular statistics (Batschelet, 1981).
In-phase measurements of CRP were bounded on the interval [–180, 180], such that measurements could deviate in a bidirectional fashion from the referent coordination pattern (0°), where negative or positive values imply that the left or right hand is leading, respectively. Antiphase movements were bounded on the interval [0, 360], again, such that the prescribed pattern (180°) lies in the middle of the interval. In this case, values greater than or less than 180 refer to the right or left hand leading, respectively. CRP ensemble time series were calculated by taking the means across the 100 data points for all cycles in an epoch (Fig. 3). Ensemble time series for the standard deviation of CRP were created in a similar fashion by taking the angular deviation across the 100 data points for all cycles in an epoch (Fig. 3). Mean CRP (mCRP), the average pattern of coordination, and mean standard deviation of CRP (sdCRP), the average variability of coordination, were calculated as the mean of the CRP and standard deviation of the CRP ensemble time series, respectively. Thus, coordination behavior during each epoch was described by two measures, mCRP and sdCRP, for each device.
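
Because CRP is circular, both the interval wrapping and the summary statistics need some care. The Python sketch below shows one way to do this; it is not the authors' implementation. The function names are hypothetical, and the dispersion measure used here, the angular deviation \( \sqrt{2(1 - r)} \) with \( r \) the mean resultant length, is an assumption about which circular statistic was applied.

```python
import math


def wrap_deg(angle, center):
    """Wrap an angle (deg) into the 360-deg interval centered on `center`.

    center = 0 gives [-180, 180) for in-phase trials;
    center = 180 gives [0, 360) for antiphase trials.
    """
    return (angle - center + 180.0) % 360.0 + center - 180.0


def circular_mean_sd(angles_deg):
    """Circular mean (deg) and angular deviation (deg) of a list of angles."""
    n = len(angles_deg)
    c = sum(math.cos(math.radians(a)) for a in angles_deg) / n
    s = sum(math.sin(math.radians(a)) for a in angles_deg) / n
    r = math.hypot(c, s)                       # mean resultant length, 0..1
    mean = math.degrees(math.atan2(s, c))
    ang_dev = math.degrees(math.sqrt(max(0.0, 2.0 * (1.0 - r))))
    return mean, ang_dev
```

For example, the angles 350° and 10° average to roughly 0° rather than 180°, which is why an arithmetic mean would be misleading here. Applying `wrap_deg(mean, 180.0)` keeps an antiphase mean centered in its interval.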

Fig. 3

Exemplar time series data from a participant performing the in-phase pattern at 2.50 Hz. Panels from the Kinect and Vicon data are located on the left and right, respectively. (Top row) Vertical displacement time series of the right (upper panel) and left (lower panel) hands measured by the (a) Kinect and (b) Vicon. (Middle row) Phase portraits consisting of normalized position (x-axis) and velocity (y-axis) profiles for each movement cycle of the left and right hands measured by the (c & d) Kinect and (e & f) Vicon, respectively. (Bottom row) CRP ensemble time series for each cycle of movement (solid gray lines) for the (g) Kinect and (h) Vicon, as well as means (solid black lines) and standard deviations (dashed black lines)

$$ \varphi(t_i) = \tan^{-1}\left(\frac{v(t_i)}{s(t_i)}\right), \quad i = 1, 2, \dots, 100 $$
(1)
$$ \mathrm{CRP}(t_i) = \varphi_R(t_i) - \varphi_L(t_i) $$
(2)
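
Equations 1 and 2 can be sketched in Python as follows. This is an illustration under stated assumptions rather than the study's code: the function names are hypothetical, each profile is normalized by its maximum absolute value (one of the two options mentioned in the text), and `atan2` is used in place of a plain arctangent so that the quadrant is resolved automatically. The sign of the result depends on the velocity and axis conventions, which are not reproduced here.

```python
import math


def normalize_max_abs(profile):
    """Scale a profile to [-1, 1] by its maximum absolute value."""
    peak = max(abs(value) for value in profile)
    return [value / peak for value in profile]


def phase_angles(position, velocity):
    """Phase angle (deg) at each point of one time-normalized cycle (Eq. 1)."""
    s = normalize_max_abs(position)
    v = normalize_max_abs(velocity)
    return [math.degrees(math.atan2(vi, si)) for vi, si in zip(v, s)]


def continuous_relative_phase(pos_r, vel_r, pos_l, vel_l):
    """CRP (deg) = right-hand minus left-hand phase angle (Eq. 2), in [-180, 180)."""
    phi_r = phase_angles(pos_r, vel_r)
    phi_l = phase_angles(pos_l, vel_l)
    return [(r - l + 180.0) % 360.0 - 180.0 for r, l in zip(phi_r, phi_l)]


# Usage: two unit sinusoids offset by 10 % of a cycle (36 deg), 100 points per cycle.
theta = [2.0 * math.pi * i / 100 for i in range(100)]
lag = 2.0 * math.pi * 10 / 100
crp = continuous_relative_phase(
    [math.cos(t) for t in theta], [-math.sin(t) for t in theta],
    [math.cos(t - lag) for t in theta], [-math.sin(t - lag) for t in theta],
)
```

With these idealized inputs the CRP is constant across the cycle, with a magnitude of 36°.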

Discrete relative phase (DRP) angles were calculated utilizing Eq. 3 below for each device (Wheat & Glazier, 2006). DRP provides a point estimate of coordination determined by the latency of two events, such as the maxima of hand displacements during oscillatory movements. DRP can be used to characterize the coordination of two oscillating segments once per cycle whereas CRP provides information about the coordination of the segments over the entire cycle. Research by Galna et al. (2014) has demonstrated that the Kinect exhibits near-perfect temporal but not spatial agreement for fine motor movements. As such, DRP was included to test whether the Kinect is better suited to measures of coordination that rely exclusively on temporal information. DRP (φ) was calculated as the temporal difference between the minima or maxima of the right and left hand positions \( \left(t_{\mathrm{R}_{\mathrm{event}}} - t_{\mathrm{L}_{\mathrm{event}}}\right) \) divided by the period of oscillation of the right hand \( \left(t_{\mathrm{R}_{\mathrm{event}}(i+1)} - t_{\mathrm{R}_{\mathrm{event}}(i)}\right) \) multiplied by 360°. DRP, like CRP, is a circular variable and thus the mean (mDRP) and variability (sdDRP) were calculated using circular statistics (Batschelet, 1981). Further details on discrete and continuous measures of coordination and coordination variability are reviewed in Wheat and Glazier (2006).

$$ \varphi = \frac{t_{\mathrm{R}_{\mathrm{event}}} - t_{\mathrm{L}_{\mathrm{event}}}}{t_{\mathrm{R}_{\mathrm{event}}(i+1)} - t_{\mathrm{R}_{\mathrm{event}}(i)}} \times 360^{\circ} $$
(3)
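
Equation 3 can be sketched in Python as below. The function name is hypothetical, and the sketch assumes that right- and left-hand events have already been paired by index, which the text does not specify.

```python
def discrete_relative_phase(right_events, left_events):
    """DRP (deg) for each cycle from paired event times in seconds (Eq. 3).

    right_events, left_events : event times (e.g., position maxima), paired
    by index; pairing by index is an assumption of this sketch.
    """
    drp = []
    for i in range(len(right_events) - 1):
        period = right_events[i + 1] - right_events[i]   # right-hand period
        latency = right_events[i] - left_events[i]       # right minus left
        drp.append(latency / period * 360.0)
    return drp


# Usage: a 2-Hz movement in which the left hand is offset by half a cycle.
print(discrete_relative_phase([0.0, 0.5, 1.0], [0.25, 0.75, 1.25]))
```

The resulting angles are circular, so their mean and variability should be summarized with circular statistics rather than arithmetic ones.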

Statistical analysis

To assess whether participants moved at the prescribed frequency, a three-factor Hand (2) × Pattern (2) × Frequency (8) mixed-model, repeated measures analysis of variance (ANOVA) was conducted on the Vicon measurements. Each observation was adjusted by subtracting the metronome frequency, yielding deviations from the prescribed frequency of movement. Hand, pattern, and frequency were treated as fixed factors with participant as the repeated factor. Tukey’s HSD tests with α = .01 were used to examine significant interactions as needed. Standardized effect sizes are reported using Cohen’s d with thresholds of 0.2 (small), 0.5 (moderate), and 0.8 (large) (Cohen, 1992). Device was not included as an independent variable for two reasons: (1) research-grade technology, not the device under examination, should be used to establish participants’ behavior and (2) ANOVAs are inappropriate for examining between-device agreement.

Agreement between the two devices was assessed in each condition for movement frequency, movement amplitude, mCRP, sdCRP, mDRP, and sdDRP. Bland–Altman bias and 95 % limits of agreement (LoA) provide the mean between-device difference and an interval within which 95 % of the between-device differences fall, respectively (Bland & Altman, 1999). Measurements from two methods are considered exchangeable when the LoA are narrow. Acceptable widths for the LoA are defined by the specific measurement application. Bias, \( \overline{x}_{\mathrm{Kinect}} - \overline{x}_{\mathrm{Vicon}} \), and LoA, \( \left(\overline{x}_{\mathrm{Kinect}} - \overline{x}_{\mathrm{Vicon}}\right) \pm 2\,SD_{\mathrm{Bias}} \), were assessed independently for each condition to identify whether agreement was influenced by coordination pattern or movement frequency. Bland–Altman plots, which display the average, \( \frac{x_{\mathrm{Kinect}} + x_{\mathrm{Vicon}}}{2} \), versus the between-device difference, \( x_{\mathrm{Kinect}} - x_{\mathrm{Vicon}} \), of paired measurements, provided a qualitative check for violations of constant bias or homoscedasticity. However, bias and LoA provide a sufficient description when these assumptions are not violated.
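
The bias and limits of agreement defined above can be computed as in the following Python sketch. The function name and the example values are hypothetical, and the sample standard deviation (n − 1 denominator) is an assumption.

```python
import statistics


def bland_altman(kinect, vicon):
    """Bias and 95 % limits of agreement for paired measurements.

    Returns (bias, (lower, upper)), where bias is the mean Kinect-minus-Vicon
    difference and the limits are bias +/- 2 SD of the differences.
    """
    diffs = [k - v for k, v in zip(kinect, vicon)]
    bias = statistics.mean(diffs)
    sd = statistics.stdev(diffs)        # sample SD (n - 1), an assumption
    return bias, (bias - 2.0 * sd, bias + 2.0 * sd)


# Usage with made-up paired values (not data from the study).
bias, (lower, upper) = bland_altman([1.0, 2.0, 3.0, 4.0], [0.9, 2.1, 2.8, 4.2])
```

Plotting each difference against the corresponding pairwise mean gives the diagnostic plot used to check for trends or unequal spread.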

Concordance correlation coefficients (\( r_c \)) with 95 % confidence intervals were used to assess absolute agreement between two devices (treated as raters) measuring the same quantity and reflect the precision and dispersion from the identity line (y = x) (Carrasco et al., 2013). \( r_c \) is standardized, taking on values from –1 to 1. Values of 1 and –1 indicate perfect agreement and disagreement, respectively. Measures of absolute agreement, \( r_c \), effectively impose a penalty for relationships other than one-to-one. Pearson’s correlation coefficients (r) were used to assess relative agreement between the devices. Estimates of r tend to be skewed toward zero due to asymmetrical distributions. This impacts estimates of r and confidence intervals, particularly for sample sizes below 30 (Gorsuch & Lehmann, 2010). Fisher’s Z transformations provided unbiased estimates of r and 95 % confidence intervals. To maintain consistency with studies examining the Kinect’s accuracy and precision to capture aspects of human movement, correlation thresholds were set as poor (<.40), modest (.40–.74), and excellent (>.75; Clark, Bower, et al., 2013).
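
The difference between absolute and relative agreement can be made concrete with a short Python sketch. It uses Lin's basic concordance estimator and the usual Fisher Z confidence interval; this is an assumption for illustration, since the cited approach (Carrasco et al., 2013) may use a different estimator, and all function names are hypothetical.

```python
import math
import statistics


def pearson_r(x, y):
    """Pearson correlation coefficient (relative agreement)."""
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


def concordance_cc(x, y):
    """Lin's concordance correlation coefficient (absolute agreement)."""
    n = len(x)
    mx, my = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    # The (mx - my)**2 term penalizes any offset from the identity line.
    return 2.0 * sxy / (sxx + syy + (mx - my) ** 2)


def fisher_ci(r, n, z_crit=1.96):
    """Approximate 95 % confidence interval for r via Fisher's Z (|r| < 1)."""
    z = math.atanh(r)
    half_width = z_crit / math.sqrt(n - 3)
    return math.tanh(z - half_width), math.tanh(z + half_width)


# Usage: a constant offset leaves r at 1 but lowers the concordance to 0.8.
x = [1.0, 2.0, 3.0, 4.0, 5.0]
y = [2.0, 3.0, 4.0, 5.0, 6.0]
print(pearson_r(x, y), concordance_cc(x, y))
```

The example shows why a device can correlate well with the reference yet still disagree with it in absolute terms.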

Results

A total of 1,920 epochs (eight per trial) of coordination behavior were captured. During the analysis, there were irregularities in the hand kinematics collected by the Kinect for a small proportion of trials. These irregularities occurred when the Kinect had difficulties acquiring a stable skeletal fit despite our efforts to control the environmental and user conditions. Issues with obtaining a quality skeletal fit may have been influenced by the lack of anatomical landmarks around the hands in the Kinect’s skeletal layout. As such, the discarded data are summarized below and were eliminated from the statistical analysis since including these trials in the analyses would lead to inaccuracies detecting movement cycles and an underestimation of the Kinect’s ability to measure coordination behavior.

A total of 188 of the 1,920 epochs (9.8 %) from 16 of the 24 participants (~67 %) were removed from the analysis due to kinematic irregularities. Fourteen of the 240 total trials (112 epochs) were removed in their entirety (i.e., all eight epochs), which comprised a majority of the discarded data (~60 %); three of those trials came from a single participant. Cases in which a single epoch and multiple epochs (range, two to six) were removed from a trial represented approximately 8 % and 32 % of the total discarded data, respectively. Counts and percentages of discarded epochs for each combination of pattern and frequency are displayed in Table 1. In general, kinematic issues occurred most frequently at the highest movement frequency (i.e., 3.33 Hz) and equally between in-phase (94, 50 %) and antiphase (94, 50 %) trials.

Table 1 Summary of the epochs excluded from the statistical analyses

When investigating concurrent validity with repeated measures, it is important to adjust the LoA to reflect the correlated nature of repetitions captured from the same participant. These adjusted LoA reflect prediction limits for between-device differences for a single future observation (Carstensen et al., 2008). However, due to large within- and between-subjects variability, the calculated limits were too large for future predictions. As such, mCRP, sdCRP, mDRP, and sdDRP values were averaged over repetitions to represent the coordination behavior of an individual for a particular pattern and frequency. Thus, the results presented below represent the strength of between-device agreement for behavior captured over trials.

Movement frequency

Differences between movement frequency and metronome frequency and standard deviations for each device are displayed in Fig. 4. On the basis of measurements obtained from the Vicon, participants were unable to match their movement frequency to the highest metronome frequency. A three-factor Hand (2) × Pattern (2) × Frequency (8) repeated measures ANOVA revealed a significant Pattern × Frequency interaction [F(7, 736) = 16.89, p < .0001]. Post hoc tests revealed that participants were moving faster than the metronome at intermediate frequencies for the in-phase pattern [1.25 Hz: t(0.995, 736) = 2.62, d = 0.10; 1.43 Hz: t(0.995, 736) = 3.04, d = 0.11; 1.67 Hz: t(0.995, 736) = 2.99, d = 0.11]. Movement frequencies lower than prescribed were observed at the highest frequency (3.33 Hz) for the in- and antiphase patterns [0°: t(0.995, 736) = –5.95, d = 0.22; 180°: t(0.995, 736) = –23.01, d = 0.85]. Deviations ± standard deviations from the metronome frequency at 3.33 Hz for the in- and antiphase patterns were –0.11 ± 0.25 Hz and –0.43 ± 0.30 Hz, respectively. Given that the antiphase pattern is inherently unstable at high movement frequencies, our instructions to maintain the antiphase coordination may have forced participants to reduce movement frequency to meet the task demands.

Fig. 4

Mean differences between movement and metronome frequencies and standard deviations (error bars). Movement frequency was assessed for each hand and device independently. Negative values imply movements slower than the prescribed frequency

Bias and 95 % LoA and measures of agreement (\( r_c \) and r) for movement frequency are presented in Tables 2 and 3, respectively. Overall, near-zero biases (range: –0.04 to 0.01 Hz) were observed for both patterns at all but the highest frequency (3.33 Hz). At 3.33 Hz, the Kinect underestimated movement frequency relative to the Vicon (range: –0.21 to –0.05 Hz). LoA grew as the metronome frequency increased (range: 0.04 to 1.65 Hz), suggesting that the variability of between-device differences increased with movement speed. The LoA were large at the highest movement frequency for in-phase coordination, due to the fact that a single participant consistently failed to increase the movement frequency from 2.50 to 3.33 Hz. Small biases and narrow LoA are reflected in the modest-to-excellent absolute and relative agreements (\( r_c \) and r > .90) for the in-phase pattern at low to intermediate movement frequencies (i.e., 1.00–2.00 Hz) (Table 3). Absolute and relative agreement were reduced at the two highest movement frequencies for the in-phase pattern, particularly for the left hand (\( r_c \) and r range: .20–.52). For the antiphase pattern, \( r_c \) and r were greater than .90 for all but the lowest frequency for the right hand, and greater than .75 for all frequencies besides 2.00 Hz for the left hand. The observation that participants were moving more slowly than prescribed during the highest frequencies for the antiphase pattern may explain the disparity between the measures of agreement for the in- and antiphase patterns.

Table 2 Bland–Altman biases and 95 % LoAs for movement frequency (Hz)
Table 3 Absolute (\( r_c \)) and relative (r) measures of agreement for movement frequency

Movement amplitude

The means ± standard deviations of movement amplitude are displayed in Fig. 5. Movement amplitude was consistently overestimated by the Kinect. Differences between the two systems decreased as movement frequency increased. Bland–Altman diagnostic plots revealed a positive, linear relationship between the bias \( \left(\overline{x}_{\mathrm{Kinect}} - \overline{x}_{\mathrm{Vicon}}\right) \) and the mean of paired measurements \( \left(\overline{x}_M = \frac{\overline{x}_{\mathrm{Kinect}} + \overline{x}_{\mathrm{Vicon}}}{2}\right) \) for all conditions (Fig. 6). Typically, Bland–Altman plots should not contain a trend. As such, bias was quantified by regressing the bias on \( \overline{x}_M \), \( \mathrm{bias} = b_0 + b_1 \overline{x}_M \), for each condition. LoA were constructed by detrending the bias and determining \( 2SD_{\mathrm{bias}} \). Adding and subtracting \( 2SD_{\mathrm{bias}} \) from \( b_0 \) produces two lines running parallel to the bias, \( b_0 \pm 2SD_{\mathrm{bias}} + b_1 \overline{x}_M \), which capture 95 % of the between-device differences. The bias and LoA for movement amplitude are presented in Table 4. The slope of the bias, \( b_1 \), tended to increase with movement frequency, whereas the intercept, \( b_0 \), decreased. The LoA were generally between 30 and 40 mm, but increased slightly for the left hand while performing the antiphase pattern.
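
The regression-based limits described above can be sketched in Python as follows. This is an illustrative reading of the procedure, not the authors' code: the function names are hypothetical, the fit is ordinary least squares of the between-device difference on the pairwise mean, and the standard deviation of the detrended differences uses an n − 1 denominator, which is an assumption.

```python
import statistics


def regression_bias(kinect, vicon):
    """Fit bias = b0 + b1 * mean and return (b0, b1, SD of detrended bias)."""
    diffs = [k - v for k, v in zip(kinect, vicon)]
    means = [(k + v) / 2.0 for k, v in zip(kinect, vicon)]
    m_bar, d_bar = statistics.mean(means), statistics.mean(diffs)
    b1 = (sum((m - m_bar) * (d - d_bar) for m, d in zip(means, diffs))
          / sum((m - m_bar) ** 2 for m in means))
    b0 = d_bar - b1 * m_bar
    residuals = [d - (b0 + b1 * m) for m, d in zip(means, diffs)]
    return b0, b1, statistics.stdev(residuals)


def limits_at(b0, b1, sd, mean_value):
    """Lower and upper limits of agreement at a given pairwise mean."""
    center = b0 + b1 * mean_value
    return center - 2.0 * sd, center + 2.0 * sd


# Usage with made-up amplitudes (mm) in which the Kinect reads 20 % high.
b0, b1, sd = regression_bias([12.0, 24.0, 36.0, 48.0], [10.0, 20.0, 30.0, 40.0])
```

In this idealized example the bias grows in proportion to the amplitude, so the slope is positive and the detrended spread is essentially zero.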

Fig. 5

Means and standard deviations (error bars) of movement amplitude for each hand and device

Fig. 6

Bland–Altman plots for movement frequency (a) and amplitude (b) for the antiphase pattern at 1.00 Hz. No trends or violations of homoscedasticity were observed for movement frequency. A positive linear relationship between the bias \( \left({\overline{x}}_{\mathrm{Kinect}}-\kern0.5em {\overline{x}}_{\mathrm{Vicon}}\right) \) and the mean of paired measurements \( \left({\overline{x}}_M = \frac{{\overline{x}}_{\mathrm{Kinect}} + \kern0.75em {\overline{x}}_{\mathrm{Vicon}}}{2}\right) \) was observed for movement amplitudes in all conditions

Table 4 Bland–Altman bias and 95 % LoA for movement amplitude (mm)

Measures of agreement (\( r_c \) and r) are presented in Table 5. Absolute agreement was poor to modest for the in-phase pattern (range: .33–.65) and modest for the antiphase pattern (range: .43–.74). \( r_c \) was reduced for the left hand at low frequencies of the in-phase pattern. Relative agreement was consistently modest-to-excellent for all conditions. Values of r were consistently excellent for the right hand (range: .76–.87) and modest-to-excellent for the left hand (range: .59–.83).

Table 5 Absolute (\( r_c \)) and relative (r) measures of agreement for movement amplitude

Continuous relative phase

Exemplar time series, phase portraits, and ensemble CRP curves from a single epoch are displayed in Fig. 3. Prominent between-device differences emerge between the phase portraits of the Kinect (Fig. 3c and d) and Vicon (Fig. 3e and f) due to the inclusion of velocity information. The ensemble CRP curves of the Kinect were more variable within and between cycles, although the mean curves were generally similar with small structural differences (Fig. 3g and h). Despite our best efforts to ensure the stability of the antiphase pattern at higher movement frequencies through the use of a metronome and instructions to maintain the prescribed pattern, phase transitions were observed. However, participants generally made corrections back to the intended pattern. Figure 7 displays the CRP ensemble time series of a participant performing the antiphase pattern at 2.50 Hz for the Kinect (panel a) and Vicon (panel b). During a single cycle, both devices measured a transition to the in-phase pattern and subsequent correction back to the antiphase pattern, which occurred on the timescale of 200 ms.

Fig. 7

CRP ensemble time series from a single participant at 2.50 Hz from the (a) Kinect and (b) Vicon. Each gray line represents the CRP for a single cycle of movement. Solid and dashed black lines represent the means (solid) and standard deviations (dashed) of CRPs across cycles. Both devices measured a phase transition and correction within the same cycle (note that 0° and 360° are both in phase). Movement periods are approximately 400 ms. The transition to in-phase and the subsequent correction back to antiphase occur between 35 % and 90 % of the cycle (~220 ms), suggesting that the Kinect is capable of accurately measuring the space–time behavior of multiple limbs under certain conditions, despite its lower spatial and temporal resolutions than research-grade technology

The means and standard deviations of mCRP and sdCRP for each condition are shown in Fig. 8. Bias and 95 % LoA values are presented in Table 6. Between-device differences of mCRP were small for the in-phase (range: –2.1° to 0.8°) and antiphase (range: –4.2° to –1.8°) patterns, and increased marginally as a function of increasing movement frequency. LoA were larger for the antiphase pattern (~30°) than for the in-phase pattern (~20°). Additionally, the LoA for both patterns grew slightly as the movement frequency increased. Small biases in sdCRP were observed independent of movement frequency and pattern (range: 3.9°–7.4°), with the Kinect overestimating coordination variability, as is demonstrated in the bottom panels of Fig. 3. LoA were smaller for sdCRP than mCRP, but grew similarly as movement frequency increased and were higher for the antiphase than for the in-phase pattern.

Fig. 8

Means and standard deviations (error bars) of mCRP and sdCRP for the in-phase (triangles) and antiphase (circles) patterns across participants for the Kinect (left panel) and Vicon (right panel)

Table 6 Bland–Altman bias and 95 % LoA for mCRP and sdCRP (°)

The concordance and Pearson correlation coefficients for mCRP and sdCRP are presented in Tables 7 and 8, respectively. Mean CRP (mCRP) showed modest-to-excellent absolute and relative agreements for the in-phase pattern. However, poor-to-modest absolute and relative agreements were found at all but the three highest movement frequencies for the antiphase pattern. The reductions in agreement for the antiphase pattern were influenced by a single participant who displayed large between-device differences (~30°). Paired measurements from this individual agreed more closely at the highest movement frequencies, thus leading to the increase in absolute and relative agreements. Time series data from this individual lacked the obvious issues seen in the discarded data; the Kinect was simply not capturing the appropriate spatiotemporal movements of the hands. Calculating the antiphase statistics excluding this individual significantly improved the estimates of absolute (range: .57–.81) and relative (range: .60–.82) agreement, reduced the bias (range: 1.0°–3.8°), and narrowed the LoA (~25°) to within range of the in-phase pattern. Figures 9 and 10 show the mCRP measurements for both devices for the in-phase and antiphase patterns, respectively. Relative agreement was higher than absolute agreement because of departures of the data from the identity line. By contrast, the sdCRP measurements generally exhibited poor absolute agreement and poor-to-modest relative agreement. The Kinect significantly overestimated coordination variability for all conditions. The confidence limits of r_c for both mCRP and sdCRP were fairly large, at times upward of .4.
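To illustrate why absolute and relative agreement can diverge, the sketch below computes Lin's concordance correlation coefficient alongside the Pearson coefficient for paired measurements. It assumes the standard moment-based definition of the concordance coefficient with 1/n variances. The function name `agreement` and the example values are hypothetical, and confidence limits are not computed.

```python
from statistics import mean


def agreement(x, y):
    """Return (concordance r_c, Pearson r) for paired measurements.

    Assumption: Lin's concordance correlation coefficient with population
    (1/n) variances and covariance.
    """
    if len(x) != len(y) or len(x) < 2:
        raise ValueError("need at least two paired measurements")
    n = len(x)
    mx, my = mean(x), mean(y)
    sxx = sum((a - mx) ** 2 for a in x) / n
    syy = sum((b - my) ** 2 for b in y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y)) / n
    pearson_r = sxy / (sxx * syy) ** 0.5
    # The squared mean difference penalizes any shift away from y = x
    concordance = 2.0 * sxy / (sxx + syy + (mx - my) ** 2)
    return concordance, pearson_r


# Hypothetical example: a constant 5 degree offset between devices
kinect = [0.0, 10.0, 20.0, 30.0]
vicon = [5.0, 15.0, 25.0, 35.0]
r_c, r = agreement(kinect, vicon)
print(f"r_c = {r_c:.3f}, r = {r:.3f}")
```

With a constant offset the points lie on a straight line, so the Pearson coefficient stays at 1 while the concordance coefficient drops, because the line is displaced from the line of identity.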

Table 7 Absolute (r_c) and relative (r) measures of agreement for mCRP
Table 8 Absolute (r_c) and relative (r) measures of agreement for sdCRP
Fig. 9

Mean CRP (mCRP) of the Kinect (x-axis) and Vicon (y-axis) for the in-phase (0°) pattern for each movement frequency. Each point represents the mean behavior of a participant across repetitions, measured in degrees. The solid black lines represent the line of identity (y = x). The concordance correlation coefficient (r_c) assesses the precision and deviation of paired measurements from this line. The dotted lines represent the least squares linear regressions of the paired measurements. The Pearson product-moment correlation coefficient (r) assesses the strength of this relationship

Fig. 10

Mean CRP (mCRP) of the Kinect (x-axis) and Vicon (y-axis) for the antiphase (180°) pattern for each movement frequency. Each point represents the mean behavior of a participant across repetitions, measured in degrees. The solid black lines represent the line of identity (y = x). The concordance correlation coefficient (r_c) assesses the precision and deviation of paired measurements from this line. The dotted lines represent the least squares linear regressions of the paired measurements. The Pearson product-moment correlation coefficient (r) assesses the strength of this relationship

Discrete relative phase

Biases and 95 % LoA for DRP variables are presented in Table 9. For mDRP, biases were smaller for the in-phase (range: –5.1° to 2.0°) than for the antiphase (range: –8.4° to –2.2°) pattern. A trend toward reduction in the bias was observed as movement frequency increased, independent of pattern. The LoA for mDRP were between approximately 20° and 50° for the in-phase pattern, and between 35° and 60° for the antiphase pattern. As compared to mCRP, the biases observed for mDRP were slightly larger and the LoA were wider. The biases for sdDRP were of similar magnitude to those observed for sdCRP and were independent of movement frequency and pattern. As with CRP, the Kinect overestimated coordination variability for DRP. The LoA for sdDRP were consistent as movement frequency increased. No effect of pattern was observed.

Table 9 Bland–Altman bias and 95 % LoA for mDRP and sdDRP (°)

Concordance and Pearson correlation coefficients for mDRP and sdDRP are presented in Tables 10 and 11, respectively. Substantial variability in the measures of agreement was observed for mDRP. For the in-phase pattern, absolute agreement was poor for the five lowest frequencies (range: .03–.17), modest at 2.00 and 2.50 Hz (.48 and .73, respectively) and excellent at 3.33 Hz (.84). Conversely, for the antiphase pattern, absolute agreement was modest-to-excellent (range: .66–.76), and relative agreement was modest-to-excellent (range: .72–.84). The differences in agreement between the in- and antiphase patterns were driven by five participants. Comparing the hand trajectories from these participants with those measured by the Vicon showed that the Kinect inaccurately captured the hand kinematics at the turn-around points of the movements, leading to variable event selection. Similar to sdCRP, measurements of sdDRP showed poor-to-modest absolute agreement (0°: .09–.49; 180°: .03–.54) and poor-to-excellent relative agreement (0°: .21–.75; 180°: .09–.67) for the in-phase and antiphase patterns, respectively.

Table 10 Absolute (r c) and relative (r) measures of agreement for mDRP
Table 11 Absolute (r c) and relative (r) measures of agreement for sdDRP

Discussion

Our findings demonstrate that the Microsoft Kinect exhibits modest-to-excellent agreement for capturing macroscopic patterns of bimanual coordination across a wide range of movement frequencies, using both a point and a continuous estimate of relative phase. Between-device differences did emerge, however, when assessing coordination variability.

Movement frequency and amplitude

Past research demonstrates that the Kinect has difficulty tracking movements of the extremities and exhibits reduced spatial accuracy for fine movements (Bonnechere et al., 2014; Galna et al., 2014). Our results suggest that the Kinect inaccurately represents the amplitude of oscillatory wrist movements at a variety of movement frequencies. However, between-device differences decreased as movement frequency increased. Our results are congruent with previous findings that the Kinect has difficulties detecting movements of the hands and feet (Clark, Bower, et al., 2013; Galna et al., 2014), likely due to the lack of anatomical landmarks in these areas. Despite the between-device differences in movement amplitude, movement frequency was characterized by excellent absolute agreement, similar to previous research (Galna et al., 2014). Agreement for mCRP and mDRP improved at higher frequencies. The improvement in the coordination measures likely emerged because structural differences of the hand trajectories decreased as movement frequency increased (Fig. 2). Specifically, movements became more sinusoidal at higher frequencies, as measured by both devices. Some participants failed to produce continuous movements at the lowest frequencies. Rather, they would move quickly and stop their hand to wait for the beep. This movement strategy has been found in previous research and is termed an anchoring effect (Byblow et al., 1994). At lower frequencies, hand movements were often sharp and smooth in one direction and protracted and more variable in the other (Fig. 2). The discontinuous nature of these movements impacted the Kinect’s ability to accurately measure hand movement. Future work could examine whether the Kinect is capable of accurately measuring ballistic versus slower, continuous movements. Our results suggest that when movements are slowed down the Kinect has difficulty accurately tracking the hands; however, this may not be the case for other anatomical landmarks.

Coordination measures

This study assessed coordination using CRP, which incorporates position and velocity information from two limbs, and DRP, which relies exclusively on event timing. We initially believed that DRP would exhibit a higher between-device agreement than CRP, since previous research has identified that the Kinect exhibits near-perfect temporal agreement (Galna et al., 2014). Our results demonstrate that this is not the case, likely because DRP measures the time between two events using spatial criteria (i.e., the hands must be in a specific orientation) and therefore is different from simply measuring movement time.
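The way CRP draws on both position and velocity can be sketched as follows. This is an illustrative version only: the normalization (position centered on its midrange and scaled to ±1, velocity scaled by its peak magnitude), the central-difference velocity, and the sign convention are all assumptions, and the function names `phase_angles` and `continuous_relative_phase` are hypothetical rather than taken from the analysis reported here.

```python
import math


def phase_angles(position, dt):
    """Phase angle (degrees) at each interior sample of one limb's trajectory."""
    lo, hi = min(position), max(position)
    if hi == lo:
        raise ValueError("trajectory is constant; phase is undefined")
    mid, half = (hi + lo) / 2.0, (hi - lo) / 2.0
    # Assumption: center on the midrange and scale position to [-1, 1]
    x = [(p - mid) / half for p in position]
    # Central-difference velocity; v[i] belongs to sample i + 1 of x
    v = [(x[i + 1] - x[i - 1]) / (2.0 * dt) for i in range(1, len(x) - 1)]
    v_max = max(abs(s) for s in v) or 1.0
    return [math.degrees(math.atan2(v[i] / v_max, x[i + 1])) for i in range(len(v))]


def continuous_relative_phase(left, right, dt):
    """CRP in [0, 360) degrees: left phase angle minus right phase angle."""
    pairs = zip(phase_angles(left, dt), phase_angles(right, dt))
    return [(a - b) % 360.0 for a, b in pairs]


# Hypothetical 1 Hz antiphase movement sampled at 100 Hz
dt = 0.01
left = [math.sin(2.0 * math.pi * i * dt) for i in range(200)]
right = [-p for p in left]
crp = continuous_relative_phase(left, right, dt)
print(round(crp[0], 3), round(crp[-1], 3))
```

Because every sample contributes a phase angle, noise in either the position or the differentiated velocity of either hand propagates into the CRP curve across the whole cycle.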

Both the point (DRP) and continuous (CRP) estimates of relative phase demonstrated modest-to-excellent agreement for capturing the macroscopic pattern of coordination. The LoA for DRP were much wider than those for CRP. This may be expected because DRP contains information from a single point in time, whereas CRP captures coordination behavior across the entire movement cycle. Thus, unlike DRP, between-device differences in CRP are not as strongly influenced by errors from a single point. The difficulty measuring DRP is likely due to the skeletal tracking algorithms used by the Kinect. In most gaming applications, subtle changes to the orientation or length of a limb have little impact on the quality of user interaction. However, for research- or assessment-based applications, inconsistencies in the kinematic skeleton can significantly alter outcome measures, as observed in this study.
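The single-point nature of DRP can be made concrete with a short sketch. It assumes that one event per cycle (for example, a movement reversal) has already been detected for each hand, and that DRP is the latency of the second hand's event within the reference hand's cycle, expressed in degrees. The circular mean is included because 0° and 360° both denote the in-phase pattern. The function names and event times are hypothetical.

```python
import math


def discrete_relative_phase(ref_events, other_events):
    """DRP (degrees) for each reference cycle; event times in seconds, ascending."""
    drp = []
    for start, end in zip(ref_events, ref_events[1:]):
        # First event of the other limb that falls inside this reference cycle
        inside = [t for t in other_events if start <= t < end]
        if inside:
            drp.append((inside[0] - start) / (end - start) * 360.0)
    return drp


def circular_mean_deg(angles):
    """Mean direction of angles in degrees, returned in [0, 360)."""
    s = sum(math.sin(math.radians(a)) for a in angles)
    c = sum(math.cos(math.radians(a)) for a in angles)
    return math.degrees(math.atan2(s, c)) % 360.0


# Hypothetical event times for a 2.5 Hz antiphase movement (400 ms period)
left_events = [0.0, 0.4, 0.8, 1.2]
right_events = [0.2, 0.6, 1.0]
print(discrete_relative_phase(left_events, right_events))
```

In this formulation, a mislocated event in either hand shifts the DRP value for that entire cycle, which is consistent with the sensitivity to event selection described above.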

Although the Kinect was capable of characterizing the mean patterns of coordination, we found a lack of agreement when assessing coordination variability. One possible explanation for the poor agreement of coordination variability could be the fact that distributions of standard deviations are long-tailed and the values lie closer to a bound (i.e., 0°). This distribution could be problematic since r_c, r, and Bland–Altman LoA assume Gaussian error distributions. In some cases, coordination variability provides greater resolution of alterations to movement patterns (e.g., Hamill et al., 1999). In the present study, we observed that the Kinect has significant difficulty measuring coordination variability. As such, researchers and clinicians should be cautious about using repurposed gaming technology for clinical assessments of movement capacity without first ensuring that these technologies are capable of measuring the movements of interest.

Technical considerations

Researchers and clinicians should consider a variety of important factors when using the Kinect. First, the temporal resolution of the Kinect, which is irregular around 30 Hz, is lower than that of research-grade motion capture systems, as pointed out in previous research (Clark, Bower, et al., 2013; van Diest et al., 2014). Additionally, in our experience, software inefficiencies and suboptimal hardware configurations can slow the skeletal data stream to as low as 10 Hz. In this study, the sampling rate varied from approximately 27 to 33 Hz. Previously observed differences in angular range of motion (Bonnechere et al., 2014; Xu et al., 2015) may be influenced by the fact that the sampling rate for the Kinect is closer to the Nyquist rate of the signals being collected (twice their highest frequency component). As such, higher frequencies in the signal may fold back into the spectrum, causing aliasing (Hamill et al., 1997). Although the Kinect samples at a sufficient rate to capture the frequency characteristics of most human movements, the temporal representation of the signal may need to be reconstructed using signal processing techniques (Hamill et al., 1997). Interestingly, as commercial motion capture systems became more powerful, mathematical techniques, such as the Shannon reconstruction, have been largely forgotten in the movement sciences. With the emergence of gaming technology that samples at a lower rate, future work should examine whether these reconstruction techniques improve the quality of high-frequency movements captured with the Kinect in the time domain.
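The two processing steps implied above can be sketched as follows: placing irregularly timed samples on a uniform grid, then estimating the signal between samples with the Whittaker–Shannon (sinc) interpolation formula. The linear interpolation used for the first step is an illustrative choice and not a description of the processing applied in this study. The function names `resample_uniform` and `sinc_reconstruct` are hypothetical, and the finite sum is only an approximation near the ends of a record.

```python
import math


def resample_uniform(times, values, rate_hz):
    """Linearly interpolate irregular samples onto a uniform grid.

    Assumption: `times` is strictly increasing and in seconds.
    """
    step = 1.0 / rate_hz
    n = int((times[-1] - times[0]) / step) + 1
    grid = [times[0] + i * step for i in range(n)]
    out, j = [], 0
    for t in grid:
        # Advance to the pair of original samples that brackets t
        while j < len(times) - 2 and times[j + 1] < t:
            j += 1
        w = (t - times[j]) / (times[j + 1] - times[j])
        out.append(values[j] + w * (values[j + 1] - values[j]))
    return grid, out


def sinc_reconstruct(samples, rate_hz, t):
    """Whittaker-Shannon estimate at time t (s); samples start at t = 0."""
    total = 0.0
    for n, x in enumerate(samples):
        u = rate_hz * t - n
        total += x if u == 0 else x * math.sin(math.pi * u) / (math.pi * u)
    return total


# Hypothetical 2.5 Hz movement sampled uniformly at 30 Hz for 10 s
rate = 30.0
samples = [math.sin(2.0 * math.pi * 2.5 * n / rate) for n in range(300)]
t_mid = 5.0125  # falls between two sample instants
print(sinc_reconstruct(samples, rate, t_mid), math.sin(2.0 * math.pi * 2.5 * t_mid))
```

The reconstruction is only meaningful when the movement contains no energy above half the sampling rate; content above that limit has already aliased and cannot be recovered by interpolation.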

Second, the Kinect’s spatial resolution is much lower than that of research-grade motion capture systems (van Diest et al., 2014) and decreases quickly as a function of horizontal distance from the camera plane (Menna et al., 2011). Estimations of noise at the collection distance (1.4 m) are relatively small (2–3 mm) (Nguyen et al., 2012) and should not significantly influence most gross motor movements. Calibration errors from the Vicon system used in this study were less than 0.3 mm, roughly six to ten times smaller than the noise estimated for the Kinect. Thus, for finer movements (e.g., hand or finger movements or high-frequency postural sway), the signal-to-noise ratio may be reduced for the Kinect, leading to greater between-device differences. Other investigations in the domains of posture and gait have demonstrated that the Kinect is better suited to assess gross than fine motor behaviors (Clark et al., 2012; Galna et al., 2014). These results are congruent with our findings that the Kinect has difficulty assessing more subtle alterations to human movement (i.e., variability of coordination). However, using the Kinect for measuring interlimb coordination appears promising, given the preservation of qualitative features of joint angular motion during activities like locomotion (Xu et al., 2015).

Finally, a single skeletal landmark of the hand (hand center) is tracked by the Kinect. Thus, if this landmark is occluded or inaccurately tracked, skeletal inconsistencies will emerge. The Kinect appears to have difficulty measuring coordination for certain individuals. For example, the poor agreement for mDRP at lower movement frequencies was influenced by data from a subset of participants. Data removed from the statistical analysis due to kinematic irregularities were more likely to span multiple epochs within a trial. These issues may be attributed to the instabilities of the skeletal fit. The quality of the skeletal fit (jitter, landmark placement, variable segment lengths, etc.) can vary within and between subjects and within trials. The Kinect skeletal fit was inaccurate for approximately 10 % of the kinematic records. Lack of fit resulted in hand trajectories that resembled random noise, ultimately making event detection difficult. Before each trial the researchers checked the qualitative aspects of the Kinect skeletal fit. If the kinematic chain appeared unstable or inappropriately oriented the participant was asked to step out of and return to view. Despite our best efforts to mitigate the effects of various environmental and user-related concerns, the Kinect was incapable of accurately capturing movements from a subset of participants.

Implications for future work

There are many reasons why the Kinect can be a useful tool to collect coordination data. Perhaps the biggest reason is cost. The high cost of commercial motion capture is primarily a consequence of the need for multiple-camera setups. The Kinect is significantly cheaper than commercial motion capture systems, even when such systems are purchased with only two to four cameras (which would cost more than $10,000 for the cameras used in this study).

Another advantage offered by repurposed gaming technology is portability. Most research investigating human motor behaviors is constrained to a laboratory setting. Devices like the Kinect provide the opportunity to examine aspects of human movement in naturalistic settings outside of laboratory environments, since they are portable and do not require participants to be instrumented with markers. Research examining the postural and locomotor behaviors of older adults in community living settings using the Kinect demonstrates the utility of repurposed gaming technology for in-home monitoring applications (Stone & Skubic, 2012, 2014). Importantly, this technology could eventually prove useful for monitoring changes in functional motor capacities, reductions in sedentary behavior, or human-environment interaction following rehabilitative interventions aiming to improve mobility and balance.

The portable nature of the Kinect may also make it useful for incorporating bimanual paradigms into clinical applications. Previous work has utilized bimanual paradigms to distinguish children with developmental disorders (Isenhower et al., 2012; Volman et al., 2006). Isenhower et al. (2012) and Volman et al. (2006) observed differences in the mean coordination patterns produced by children with autism spectrum disorder (ASD) and developmental coordination disorder, relative to their typically developing peers, during rhythmic bimanual tasks. In a study examining social motor coordination in typically developing children and children with ASD, the Kinect was capable of discriminating coordination behaviors in one of three conditions comparably to research-grade technology (Romero, Amaral, Fitzpatrick, Schmidt, & Richardson, 2015). One limitation of the aforementioned study was that the anatomical locations being tracked differed between devices. On the basis of the findings in this study, the Kinect could ultimately provide researchers and clinicians with a tool capable of measuring coordination behavior both inside and outside of laboratory environments. However, caution should be taken before extrapolating the results of the present study to other populations and movement contexts.

The Kinect also accurately captured phase transitions and the general structure of the ensemble CRP time series (Fig. 7). Although we only examined coordination between the hands, we presume the Kinect could be used in applications that investigate macroscopic patterns of coordination across movement cycles in other segments. For example, it is common for researchers to examine the coordination between the legs during walking and running studies (Haddad et al., 2010; Haddad et al., 2006). Because we were able to capture coordination patterns in the hands, we are confident the Kinect would better capture the coordination of larger segments, such as the shank and thigh, since the skeletal fit tends to be more stable for these segments than for the most distal segments (i.e., the hands and feet).

Despite certain limitations for assessing coordination behavior, the accurate timing capabilities of the Kinect could prove useful for assessing unimanual or bimanual rhythmicity. The original use of the Kinect for pose and gesture recognition in interactive gaming may explain the device’s excellent temporal and less accurate spatial capabilities. The Kinect is unlikely to demonstrate research-grade accuracy for all movement subclasses, since it was not specifically designed to capture finer aspects of movement. Concurrent improvements in the spatial and temporal resolution and skeletal tracking may afford the possibility of mobilizing gaming technology to capture motor behaviors outside of laboratory settings.

Conclusion

In summary, the Microsoft Kinect is a useful tool for collecting macroscopic patterns of human spatiotemporal coordination across a range of movement frequencies. The Kinect does not possess concurrent validity for coordination variability. Noticeable skeletal tracking difficulties that negatively impacted the quality of the kinematic data were encountered for a subset of individuals. Continuous estimates of relative phase performed slightly better than the discrete point estimate, possibly due to the incorporation of multiple time points in the cycle. Investigation of the Kinect’s ability to capture other aspects of human coordination, including different populations (e.g., children, older adults, individuals with neuromuscular disease) and limb combinations (lower-limb, intralimb, etc.), should be pursued to further determine the Kinect’s efficacy for capturing coordination behavior in clinical domains.