A multivariate phase distribution and its estimation

Circular variables such as phase or orientation have received considerable attention throughout the scientific and engineering communities and have recently been quite prominent in the field of neuroscience. While many analytic techniques have used phase as an effective representation, there has been little work on techniques that capture the joint statistics of multiple phase variables. In this paper we introduce a distribution that captures empirically observed pair-wise phase relationships. Importantly, we have developed a computationally efficient and accurate technique for estimating the parameters of this distribution from data. We show that the algorithm performs well in high-dimensions (d=100), and in cases with limited data (as few as 100 samples per dimension). We also demonstrate how this technique can be applied to electrocorticography (ECoG) recordings to investigate the coupling of brain areas during different behavioral states. This distribution and estimation technique can be broadly applied to any setting that produces multiple circular variables.


Introduction
Circular variables such as phase or orientation have been used effectively for representing complex physical phenomenon and in the analysis and processing of signals. Countless physical systems are effectively represented using phase variables. Coupled oscillator systems are prevalent in classical physics as a canonical model of systems ranging from coupled pendula to coupled Josephson junctions. Oscillator models have also been effective at describing coupled behavior in nature: chemical reaction diffusion systems, heart-lung and circadian rhythms, and even the coupling of firefly luminescence can all be described with phase variables (see e.g. [1,2,3,4]). In engineering, phase has played a key role in signal representation. From classical Fourier analysis to modern techniques in image representation (for example [5,6,7]), phase provides a useful representation.
Within neuroscience, oscillatory dynamics and phase variables have had an especially interesting history. Oscillatory dynamics played a central role in many early theories of large-scale brain dynamics [8], and oscillatory dynamics have recently received widespread interest [9,10,11]. Network oscillations are hypothesized to be functionally involved in a wide range of tasks, such as representing sensory information, regulating the flow of information, binding of distributed information, and establishing and recalling memories. Clearly, phase is of central importance to the field of Neuroscience.
In many of these cases, the relationships between phase variables is of central interest and importance. For example, if we examine the pair-wise relationships of phase variables recovered from b c d a e f Figure 1: Phase dependencies in theta oscillations of human ECoG recordings. a An 8x8 ECoG grid on the surface of cortex; b recordings from 2 sites (5 and 13) filtered in the theta band (6 ± 1.2 Hz); c empirical joint phase distribution of sites 5 and 13: note the strong dependency between the phases. d The empirical distribution is highly concentrated in the difference of the phases, corresponding to phase correlations parameterized by re j∆ ik = e θ k e −θi , while the marginals, e and f, are flat. electrocorticography (ECoG) recordings we see strong statistical dependencies (see Figure 1 and section 2.1 for a more detailed discussion). The coupling of these oscillations may indicate common sources of input or task based cortico-cortical communication. In order to address these scientific questions we need analytical techniques that can capture the observed statistical dependencies.
While the need for techniques that model multidimensional phase variables may be clear, we know of no models or techniques that provide an adequate probabilistic representation or an effective model estimation technique. In the real-valued case the multivariate Gaussian distribution, and in the binary case the Ising model, serve as widely used multivariate distributions. There is no such equivalent for phase variables (see Discussion). Because of the lack of an appropriate probabilistic framework, many efforts have turned to measures of phase offset and phase correlation, which we will show are inadequate and may lead to false conclusions. Finally, the few efforts that do apply probabilistic models to phase variables restrict themselves to low dimensions, for example just 1 or 2 dimensions, or do not offer the flexibility necessary to deal with the distributions we have observed empirically.
In this paper we provide a solution to the probabilistic modeling of multivariate phase variables and the estimation of the model distribution from observations. We begin with a motivating example: we examine empirically observed phase distributions recorded from a 64-channel ECoG grid analyzed for theta band oscillations. We then introduce a multivariate phase distribution that is capable of capturing the empirically observed distributions. Next we provide an algorithm for recovering the parameters of our model based on score-matching. To demonstrate the distribution and algorithm we examine a weakly coupled oscillator system with 3 nodes. We then investigate the performance of the algorithm as a function of dimensionality of the phase space (2-100 dimensions) and the number of observations. As a final example, we demonstrate how this technique could be used to model the statistical distribution of theta phase from observations of an ECoG grid and the recovery of the network couplings under different behavioral states.

Observations from Empirical Phase Distributions
Scientists and engineers have observed dependencies among phase variables in a variety of settings [6,2,12]. Here we examine a case of direct interest to neuroscientists: phase distributions of theta-band oscillations recorded from human patients using electrocorticography (ECoG). In the experimental setup (for further details see Ref. [13]), an 8x8 electrode grid, shown in Figure 1a, is placed on the surface of the cortex of an awake behaving human patient who is receiving treatment for epilepsy. A number of papers have described interesting phenomena of these signals, especially when examining bandpass filtered responses. While multiple peaks are typical in the frequency response of ECoG recordings, here we examine the theta band, which has been implicated in cognitive processing. A common and useful technique uses the Hilbert transform to extract an amplitude and phase from theta-band filtered time series. Figure 1b displays the time series of two simultaneous theta-band filtered responses.
When we examine the empirical phase distributions we see clear dependencies among theta-band phase variables. The phase variables recorded from individual ECoG sites are uniformly distributed between −π and π, see Figure 1c. However, when we examine pairs of phase variables there are strong dependencies. In Figure 1d we display the empirical joint phase distribution of two neighboring ECoG sites (sites 5 and 13). The distribution exhibits a clear diagonal structure indicating that the probability of one phase conditioned on another is highly peaked.
This type of pairwise dependency can be described compactly using a von Mises distribution in the difference of the phase variables. The phase distribution of θ 1 − θ 2 can be written as: where κ 12 indicates the degree of concentration in the distribution, µ 12 indicates the mean phase offset between the variables, and I 0 (κ) is the modified Bessel function of zeroth order (solid line in Figure 1d). If we examine the empirical phase difference (θ 1 − θ 2 ), shown in Figure 1d, we see that this type of distribution captures the variation quite well. Importantly, we have observed similar dependencies among many neighboring ECoG sites as well as ECoG sites separated by centimeters on the cortex. In practice, we would like to know: which electrodes show strong phase dependencies, the strength of these dependencies during different behavioral tasks, and the phase offset between phase variables. While this pairwise model exhibits the appropriate structure for two variables, it does not specify the full joint distribution. In the next section we present a model that is capable of capturing the full joint distribution of all 64 electrode sites.

A Model of Multidimensional Phase Variables
Motivated by observations of empirical data, we introduce the following d-dimensional multivariate phase distribution where x is the d-dimensional complex vector with components x i = e jθi , K is the d×d-dimensional hermitian positive-definite matrix with elements K ik = κ ik e jµ ik and Z(K) is the normalization constant needed to assure that the probability integrates to one. Note that it is non-trivial to compute the normalization Z(K).
We can expand the energy function, E(θ; K) in Equation (2) to obtain a possibly more intuitive form:

Model Estimation
In order to fit the parameters of the distribution to a given data distribution, we use the scorematching method introduced by Hyvarinen [14,15]. Score-matching allows the fitting of probability distributions of the form , for real-valued data (x ∈ R n ), without computation of the normalization constant Z. The values of the parameters are obtained by minimizing a score function that contains the log-derivative of the model density but not its normalization constant. If the energy, E(x), depends linearly on the model parameters, the solution can be computed in closed form by setting the score function to zero [15]. We follow this approach to estimate the model parameters for our distribution (2). The score matching estimator of K is given by K = arg min K J SM (K) and the score function J SM (K) is given by with the expectation value, . . . , taken over the data distribution. Using the quadratic form of the energy in (2) and the Jacobian D ij := ∂x i /∂θ j , we compute The estimator K is computed by setting the derivative of the score function ∂/∂K ij J SM (K) to zero. This produces a system of linear equations: which we can solve using standard techniques.

An Example: Three Weakly Coupled Oscillators
To further illustrate the proposed model and to demonstrate how model parameters are recovered from a simulated system, we will work with a model of weakly coupled oscillators. We formally introduce a system of d oscillators with kinetic energy T = 1 2 ω T I ω = 1 2 T I −1 , where ω =θ is the vector of angular velocities, = I ω are the angular momenta, and I is a diagonal matrix of moments of inertia. If we introduce a coupling between any two oscillators by a force of the form F ij (θ i , θ j ), the system has the Hamiltonian H(θ, ) = E(θ) + T ( ). The Hamiltonian equations ∇ H =θ and ∇ θ H = −˙ are given as: The first equation is the definition of angular momenta and the second equation is the equation of motion. The force, F i = ∂/θ i H, experienced by an oscillator i can be computed from (3) as Integrating the second order equation of motion once and setting all moments of inertia to one, we obtain the following equations of motion that we use to simulate the system of coupled oscillators using different values for initial angular velocities ω 0 : This system is similar to the Kuramoto model [2], given byθ i = ω 0i + d j=1 s ij sin(θ i −θ j ), which does not have inertia.
For this specific example we set up a system of three weakly coupled oscillators in which the first oscillator is coupled to the second and the second to the third (no coupling between the first and the third). The arrangement is shown graphically in Figure 2; the direction of the arrows indicate phase angles of the bi-directional couplings). When we simulate this system, we observe distributions similar to those observed in ECoG data (Figure 1). For example, the marginal distributions in each of the phase variables is flat, and there are strong concentrations in the differences of the phase variables. Interestingly, even though there is no explicit coupling term between the first and third variables there is a strong concentration in their difference. If we use a measure of phase correlation, e θ k e −θi = r ik e j∆ ik , we see a significant correlation between the first and third phase variables (phase correlations are depicted in Figure 2b).

True Coupling
Reconstructed The estimation technique is able to recover the underlying coupling relationships from measurements of the simulated system 1 . Simulating the system in (7) we obtain 100,000 samples of the 3-d phase space. We then solve the linear system of equations given by (4) to estimate a K matrix. The terms in the estimated K matrix are depicted in Figure 2c. The estimation technique is able to recover the parameters to within 1% of their true values. Importantly, the estimated parameters indicate that there is no coupling between the first and third values (where measures of phase correlation indicate a high concentration).
After estimating the values of the K matrix we can compare the probability distribution produced by these parameters to the empirical distribution produced from the oscillating system. However, because the normalization constant in equation (2) is intractable we must use Monte Carlo meth-ods to estimate the true probability. While a number of techniques may be used, we have found that the hybrid Monte Carlo (hMC) algorithm works well for this distribution [16,17]. We used 50 leapfrog steps and adjusted the step size to achieve an acceptance rate close to 90%. Taking 10,000 samples from a hMC chain we observe that the estimated distribution closely matches the empirical distribution of the coupled oscillator system. Again, the marginal phases are flat and there are concentrations in the difference of the phase variables. Importantly, the concentration of the unimodal distributions are close as are the offsets.

Performance of Estimation Technique
We now demonstrate the consistency of our estimation technique: the ability of the technique to recover the model parameters from data. The procedure is as follows. We begin by sampling a set of model parameters K. Given these model parameters we then sample phase variables from that model using hMC. We then estimate the model parameters given the sampled data. The real and imaginary entries of the complex matrix K are sampled from a normal distribution: Re{K ij }, Im{K ij } ∼ N (0, 1) and the diagonal entries are set to zero: K ij = 0. Note that this produces a dense coupling matrix.
In the first column of Figure 3, we graphically display the element-wise amplitude and phase of a sample matrix K where d = 16. Given this matrix we sampled N = 2560 points using hMC. The recovered model parameters are shown in the second column of Figure 3.
While it is clear that these matrices are visually similar, we quantified the error using two different metrics. First we calculate the mean-squared-error of the matrix elements mse = whereK ij is the estimated model parameters. In the third column of Figure 3a we display the element-wise error before averaging. We also computed a metric indicating the quality of the recovered parameters borrowed from Ref. [18]: u is the Heaviside step function, and K max is the maximum absolute value of all matrix entries K ij andK ij . For the example in Figure 3, mse = 0.0245, and Q .95 = 0.89. We computed these error metrics over a range of dimensions and samples per dimension. The error metrics for each dimension and samples per dimension were averaged over 10 trials and are plotted in Figure 3b. The algorithm achieves highly accurate model recovery for as few as 100 samples per dimension and achieves full recovery of parameters as the number of samples per dimension reaches 1000. This indicates that the recovery of model parameters is quite feasible in many real world settings.

A Simulation: Recovery of Couplings from Artificial 8x8 Grid Data
A number of experimental techniques that measure large-scale cortical dynamics may benefit from an analysis using the multivariate phase distribution and estimation technique described in this paper. Some of the longstanding questions within neuroscience, which may be addressed using these experimental techniques, are "how are neural networks coupled?" and "how does neural coupling change during different behavioral conditions?". With these experimental questions in mind, we examine d = 64 sampled data to demonstrate how the multivariate phase distribution and the estimation of its parameters can be used to recover behavioral-state-dependent brain activity. We use a simulation rather than real ECoG recordings in order to test our methodology while knowing the ground truth.
In this simulation we used an 8x8 grid of recording sites and imposed two different coupling matrices corresponding to two distinct brain states. The goals are to 1) recover the couplings between recording sites during both brain states and 2) isolate the change in the network between the two states. We sampled a single coupling matrix, K s0 , generated from the following distribution: the µ ij terms were centered around small offsets, and the κ ij terms were sampled from a one sided Gaussian distribution with a variance determined by the distance between the recording sites. The resulting coupling matrix is presented graphically in Figure 4a. For the second simulated brain state, K s1 , we took the original brain state, K s0 and introduced 3 additional long range couplings, depicted in Figure 4b. We simulated the system using a hMC chain and took 100,000 samples from each brain state. We then estimated the coupling matrixes for each state,K s0 andK s1 . The resulting couplings are depicted in Figures 4c and 4d. For comparison we also computed the phase correlation measured under each brain state (depicted in the third column of Figure 4).
This method successfully uncovers the couplings under the different brain states and can be used to isolate the additional coupling produced by the second brain state. The recovered couplings are depicted in the second column of Figure 4. The parameter-estimation error metrics are: mse(K s0 ,K s0 ) = 0.00008, Q .95 (K s0 ,K s0 ) = 1.0, mse(K s1 ,K s1 ) = 0.0003, and Q .95 (K s1 ,K s1 ) = 1.0. By taking the difference of the K matrices we can isolate the additional coupling (shown in the third row of Figure 4). Clearly, the difference uncovers the 3 additional long range interactions that we introduced. In contrast, the measurement of phase correlation produces spurious results that might lead to false conclusions. In many instances the calculation of phase correlation produces high values when the true concentrations are not present (see Figure 2 for a simple example); taking the difference of the correlation measurements results in a dense set of connections, in which all but 3 are spurious.

Discussion
Our multivariate phase distribution and estimation technique can be compared to a number of previous efforts to characterize statistical dependencies in circular phase variables. We can break down the previous efforts by analyzing 3 major differences. First, many previous approaches only provide measurements such as means or phase correlations and do not produce a true probabilistic distribution over the phase variables. Second, previous examples of probabilistic models of phase variables only characterize univariate or bivariate distributions (see e.g. [19,20]). Models similar to ours have not been extended to dimensions beyond d = 2 [21,22]. Third, common multivariate phase distributions do not capture the distributions we have observed empirically. Most notably the von Mises-Fisher distribution only captures a unimodal phase distribution on the hyper-sphere, which does not produce unimodal concentrations in the differences of phase variables.
It is important to point out that our model does not model unidirectional interactions. Because our model only captures the instantaneous distribution, it is blind to time reversal and cannot model directed interactions. However, extending the analysis to include multiple time slices may provide an extension capable of modeling directed interactions (see [12] for a d = 2 example). Closely related to modeling directed interactions is the study of causality. Like all statistical models, the distribution and model we have introduced here does not directly address the issue of causality. However our distribution and estimation technique may be instrumental in attempts to infer causality in high dimensional phase spaces.  Figure 4: Recovery of coupling states from a simulated ECoG recording. We generated two network states (rows a and b). The second state included 3 additional couplings not present in the first state, as seen in the difference of states (row c). Taking 100,000 samples from each brain state, we used our algorithm to recover the model parameters (column 2). For comparison, the empirical phase correlations are displayed in column 3. Thickness of arrows indicates the magnitude of the couplings and directions indicate phase angles (coupling is bi-directional).

Contributions
In this paper we have introduced a new multivariate phase distribution, developed a fast and accurate technique for estimating the parameters of the distribution, and shown that our technique uniquely recovers the parameters of the distribution from observations. We have also examined the performance of the algorithm as a function of the dimensionality and the number of samples, and provided a demonstration of how this technique could be used for the recovery of neural coupling in cortex.