Phase Locking Value revisited: teaching new tricks to an old dog

When the concept of functional connectivity was first introduced in 1994, the study of brain synchronization seemed to be an unachievable goal. Yet, the advances in computer science and the increase in calculation power have allowed for the application of connectivity metrics to brain activity, first in sensor level and, over the last years, in source space. However, the study of source space synchronization is still a tedious task, as the high computational cost of the algorithm escalates with the square of the number of signals evaluated. In this work, we take the original implementation of Phase Locking Value (PLV) and re-formulate it in a highly computational efficient way. This new implementation avoids some highly CPU-expensive operations, achieving a 100-fold speedup over the original algorithm. This speedup makes possible to calculate whole-brain connectivity in source space in reasonable times. In addition, this reformulation presents a strong symmetry with coherence, which allows the direct comparison of both metrics. In our simulations, both metrics have a similar behavior. Nonetheless, PLV proves best to evaluate oscillators over a band, whereas coherence shows similar behavior when the synchronization occurs at only one frequency. We used this symmetry to introduce two new derived metrics insensitive to zero lag synchronization, the imaginary part of PLV (iPLV) and corrected imaginary part of PLV (ciPLV). Our simulations prove that these metrics, especially ciPLV, are highly robust in the presence of volume conduction. ciPLV, in particular, proves capable of ignoring zero-lag connectivity, while correctly estimating nonzero-lag connectivity. These results suggest that this metric is ideal to measure synchronization in the presence of volume conduction.

In this work, we take the original implementation of Phase Locking Value (PLV) and reformulate it in a highly computational efficient way. This new implementation avoids some highly CPU-expensive operations, achieving a 100-fold speedup over the original algorithm. This speedup makes possible to calculate whole-brain connectivity in source space in reasonable times.
In addition, this reformulation presents a strong symmetry with coherence, which allows the direct comparison of both metrics. In our simulations, both metrics have a similar behavior. Nonetheless, PLV proves best to evaluate oscillators over a band, whereas coherence shows similar behavior when the synchronization occurs at only one frequency.

Introduction
The study of the brain caught the interest of researchers before neuroscience was even a discipline. For a long time, the brain was considered an organ, or rather a set of organs, with distinct parts taking care of distinct duties, following the so-called phrenological point of view. However, in recent years some have begun to departure from this traditional approach in which distinct parts of the brain are believed to have different functions. Instead, they argue than cognition, thought and action are supported by the collective action of sets of brain areas (networks) [1]. Within this paradigm, the brain areas form a network (the connectome), entwined by white matter tracts (anatomical connectome), in which different sub-networks communicate dynamically with each other to perform different functions (functional connectome).
Whereas the anatomical connectome is assessed from diffusion tensor imaging (DTI) [2], the functional connectome is constructed from neuroimaging techniques such as functional magnetic resonance imaging (fMRI) and electro-and magneto-physiology [3], [4]. In the latter case, however, the description of the connectome is not straightforward, as the underlying mechanics of the brain are unknown. Instead, we have to trust on measures of brain activity at different recording sites. In this framework, functional connectivity (FC) is defined as the existence of statistical dependence between the activities in two or more sites above chance level, a dependence that can be evaluated in different ways.
In the case of signals such as M/EEG, one of the most studied connectivity hypothesis is that of phase synchronization (PS) [5]- [8]. Under this scenario, two brain areas show the activity of different oscillators and, in the case of connectivity between different regions, their oscillation properties (i.e. phase or frequency) should be related. If this relation can be mathematically evaluated, we get an estimator of phase connectivity.
Of the many PS measures available in the literature, one of the most used, mainly due to its simplicity, is the Phase Locking Value (PLV, [5]), sometimes called Mean Phase Coherence (MPC) [9]. This measure evaluates the instantaneous phase of the signals under the hypothesis that connected areas generate signals whose instantaneous phases evolve together. In this case, the phase of the signals is said to be "locked" and the difference of phases must be constant. However, real-world signals are inherently noisy, and it is not always possible to be sure that the evaluated signal only comes from one oscillator.
The problem is solved by allowing some deviation from the condition of constant phase difference. Thus, PLV evaluates the spread of the distribution of phase differences, and the connectivity estimation is linked to this spread. The narrower the distribution of the phase difference, the higher the PLV value, always ranging between zero (no phase dependence) to one (complete phase dependence). We will elaborate on the mathematics behind the estimation of the PLV in the next section. Here we briefly describe the main steps involved.
The most common method to calculate PLV is based on the instantaneous phase of the signals obtained using the Hilbert or the wavelet transforms. In both cases, the calculations are fast and reliable, which has undoubtedly contributed to its use. The calculation of the phase difference and its spread is also mathematically simple, and, in principle, not very timeconsuming. However, even when the computation of PLV is fast and easy for a fair amount of data, the computational cost grows with the square of the number of signals. This is especially important in the case of M/EEG, distributed source-space analysis, where such number ranges from the thousands to the hundreds of thousands.
It is also noteworthy that, despite the estimation of the PLV and other related indices from data has a long history, from time to time new studies appear in the literature [10], [11] in which different aspects of the calculation procedures and the indices themselves are analyzed with a fresh eye to refine the existing methodologies. This paper intends to be one of such studies to shed new light on PS estimation, interpretation and use. Thus, we start from the original PLV formulation of Lachaux [5] and rewrite it to obtain an equivalent expression that is far easier to compute, reducing the required time for the PLV calculation up to a factor of 100. Besides, we also show that this new formulation is closely related to that of coherency, thereby allowing an interpretation of this well-studied function in terms of the PLV (see also [11]). As an additional advantage, such interpretation allows the formulation of two PLV-based zero-lag-insensitive measures: the imaginary PLV (iPLV) and the corrected iPLV (ciPLV), which can be used as alternatives to the imaginary part of the coherency [12] and its corrected version [10], respectively, in the assessment of direct PS from M/EEG data.

Computational optimization
In their original paper, Lachaux [5] defined the PLV as a time-dependent connectivity measured tailored to study evoked activity. The idea behind their definition is that the stimulus resets the phase of the neural oscillators so that signals connected in a given time should have a stable phase-difference along trials. Its mathematical formulation reads: where is the number of trials and ( , ) is the instantaneous phase for signal in trial at time .
This definition can be extended to resting-state data, by assessing phase locking as a stable phase-difference over time, thereby obtain the so-called MPC 1 [9]: where is the data length.
In either case, one has to extract the instantaneous phase ( ) of each signal. Besides, for the phase to be physically meaningful, it is necessary that only one oscillator is present in each signal. This is achieved, e.g., by means of a narrow-band pass filtering or, equivalently, the convolution with a narrow band complex wavelet such as that of Morlet [13].
After the filtering process, we obtain a band-pass version of the Hilbert analytical signal: where ̃ represent the Hilbert transform of , and BP stands for band pass. The instantaneous phase is the angle between the real and the imaginary parts of the Hilbert analytical signal, or the angle between the original (band-pass) signal and its Hilbert transform.
The instantaneous phase is usually extracted from this analytical signal, the phase difference estimated and, finally, the exponentiation calculated to get the unit phase difference vector. However, these two operations (phase extraction and exponentiation) are computationally expensive, but, as we will show, they can be easily circumvented by using the properties of exponentials.
First, let us obtain the oscillatory part of the analytical signal by normalizing (3) From this, we easily derive the exponential of the phase difference: where (·) * represents complex conjugate. Thus, expression (2) can be rewritten as: or, using vector algebra: where ̇ is a vector version of ̇, , ( ) and ̇, its transpose conjugate of . This calculation is computationally very efficient, and allows a considerable speed up of the estimation with low memory penalization. The calculation efficiency is discussed in the Results section.
The efficient formulation can be extended to Lachaux's PLVi,j(t) by constructing the vectors with the -th sample of each trial:

Relation to coherence
If we expand (1) using the oscillatory part of , we get: Phase synchronization only makes sense for signals composed of a single oscillator. Yet, it is mathematically possible to calculate the PLV from both a broadband signal and an arbitrarily narrow band one, even when these calculations do not have, in principle, physical sense.
If we take the extreme case of a single oscillator whose spectrum is non-zero only at a given frequency, , the signal can be written as an out-of-phase cosine, where the phase at the initial time is equal to the Fourier phase, and the phase at any other time is determined by the delay and the frequency.
where ( ) is the Fourier transform of ( ) at and ℜ{ } stands for the real part of . In this case, Lachaux's definition of PLV is no longer time dependent, and its formulation can be simplified as: This formula closely resembles that of coherence [14]. In fact, coherence can be rewritten as: With this formulation, it is clear that coherence is a weighted average of the unit phase vectors, i.e., a version of PLV weighted by the joint amplitude of the signals at a given frequency. Alternatively, coherence can be a regarded as a version of PLV weighted by the signal-to-noise (SNR) ratio of each trial, because if the environmental conditions (noise) are stable, the amplitude of a signal is proportional to its SNR. In any case, coherence and PLV are tightly related and both are different formulations of the same principle.
It can be argued that PLV and coherence are inherently different, as the PLV must be calculated over a whole oscillator (i.e. a frequency band) and coherence is calculated independently for each frequency. However, following the previous logic, it is trivial to understand that PLV in a given band is related to the average coherence in the frequency range encompassed by the band weighted by their relative amplitude. We will further elaborate on this relationship in the Results section.

Zero-lag-insensitivity versions
Despite its popularity, the PLV presents an important limitation when used to assess functional brain connectivity: its sensitivity to volume conduction [15] and source-leakage effects. In fact, in the case of M/EEG data at the sensor level, several sensors can simultaneously pick up the activity from the same source (volume conduction). In turn, at the source level, due to the low spatial resolution of the data, different neighboring sources may share some activity (source leakage).
Luckily, due to the low capacitance of the tissues of the head for the physiological frequencies and the small distance that the currents have to travel, the propagation of the signals of interest can be considered instantaneous [15], [16]. Under this assumption, volume conduction/source leakage occurs with zero-lag propagation. In other words, the phase difference of the part of the signals related to such spurious connectivity must be zero.
Following this logic, Cornelis Stam [15] and Guido Nolte [16] came with two different phase connectivity metrics that discard zero-lag connectivity and are, thus, insensitive to volume conduction. These metrics are the phase lag index (PLI) and the imaginary part of coherency, respectively. Due to the tight relation of PLV with coherency, it is possible to extend the imaginary part of coherency to PLV to obtain a PLV-based measure insensitive to volume conduction effects, which, for symmetry, we will term imaginary PLV (iPLV): where ℑ{ } stands for the imaginary part of . This measure is insensitive to zero-lag effects, as it removes the contribution of the zero phase differences that, due to the complex exponentiation, gives real PLV values. However, (13) is not normalized, as its upper bound, corresponding to two signals with a phase difference >0, is sin( ). This can be corrected analogously to what [10] did for the imaginary part of coherency, to define a corrected imaginary PLV (ciPLV):

Speedup achieved using the proposed algorithm
In order to study the behavior of the proposed formulation of the PLV algorithm, we calculated PLV using three different implementations for different data lengths and number of signals. Even when the computation is completely deterministic, and any data set with the same characteristics would produce the same results, for the sake of fidelity, we used real source space data.
Test data consisted of five minutes of eyes-closed resting-state magnetoencephalographic (MEG) activity acquired using a 306-sensors Elekta Vectorview system (Elekta, AB, Stockholm, Sweden), located inside a magnetically shielded room (VacuumSchmelze GmbH, Hanau, Germany) at the "Laboratory for Cognitive and Computational Neuroscience" (Madrid, Spain). Data was acquired with a sampling rate of 1000 Hz (anti-alias band-pass filter of 0.1 to 330 Hz) and filtered using a spatiotemporal signal space separation method [17].
Data were segmented in 20 or 40 four-second artifact-free segments and band-pass filtered in the classical alpha band (8 to 12 Hz) using a 2000th order FIR filter in two passes with 2 seconds of real data as padding on both sides. The instantaneous phase of the signal was determined using Hilbert's analytical signal. In order to avoid edge effects, the analytical signal was calculated prior to the removal of the padding. In order to get usable timescales for the original implementations, data was downsampled by a factor of ten.
We then placed 2459 dipoles inside the head of the subject, in a 1 cm homogeneous three-dimensional grid. Source space data was calculated using a realistic single shell as a forward model [18] and a beamformer as inverse method [19]. The obtained spatial filter was applied to the sensor space data to obtain up to 2459 time series in source space. Last, all-toall PLV connectivity matrix was calculated for different numbers of sources, ranging from 500 to 2459. PLV calculation was performed using Matlab (The Mathworks Inc., Natick, Massachusetts) and the three functions whose codes are in the Appendix. Time and memory performances were measured using Matlab's tic/toc and FieldTrip's [20] memtic/memtoc functions, respectively. === Figure 1 === Figure 1 shows the results of the performance test using 40 trials and different numbers of signals. Every configuration was evaluated 5 times, and the figure shows the mean and the standard deviation. The original implementation is the slowest, but the most memoryefficient of the three evaluated algorithms. The optimized implementation achieves almost a 3-fold speedup with a slight increase in memory consumption. However, the proposed algorithm is able to get a 100-fold speedup over the optimized implementation, with only a marginal (especially at a high number of signals) increase in memory use. Results using 20 trials of data are not shown here, but the behavior was similar, with approximately half the time and memory consumption.
These results show that the proposed algorithm can achieve a dramatic speed-up in the calculation of the PLV values. The algorithm even allows the calculation of PLV from nondownsampled data, calculating the full connectivity matrix of 2459 time-series in an average of 113 seconds, using an average of around 8 GB of RAM memory, over 100 executions. Extrapolating the results shown in the previous paragraph, this execution, even with the optimized implementation, would take 3 hours.

Comparison of PLV and coherence
As noted in the Methods section, both PLV and coherence are phase-synchronization estimation algorithms. Both metrics measure similar properties of the data, but while coherence gives one result per frequency, PLV gives one result per frequency band. This difference arises from the fact that coherence estimates phase synchronization from Fourier's phase, whereas PLV estimates it from Hilbert's phase. In the extreme case of an infinitely narrow band, both phase definitions converge, but in a normal case where the band is finite, the results of coherence and PLV would diverge.
In order to evaluate the behavior of the coherence and the PLV in signals with a known degree of coupling, we used a pair of coupled chaotic systems: a Rössler system and a Lorenz one [21]. In this setup, the Rössler system acts as driver, and an oscillation frequency can be defined from it. The slave Lorenz system is driven by the Rössler to an extent determined by the coupling parameter C, ranging from zero (completely independent systems) to one. The mathematical definition of the coupled systems is: { 1 = − · ( 1 + 1 ) 1 = · ( 1 + 0.2 · 1 ) 1 = · (0.2 + 1 · 1 − 5.7 · 1 ) { 2 = 10 · ( 2 − 2 ) 2 = 28 · ( 2 − 2 − 2 * 2 + · 1 2 ) 2 = 2 · 2 − 8 3 ⁄ · 2 (15) where subscripts 1 and 2 refer to the Rössler and the Lorentz system, respectively, is a scaling parameter determining the fundamental frequency of the Rössler oscillator and is the coupling parameter. The scaling parameter was set to 10, establishing the oscillatory frequency at 0.585·π radians per sample, and the oscillatory band was set to 0.570·π to 0.600·π radians per sample. With this setup, we generated 50 pairs of signals of 20,000 samples for different values of C ranging between zero and one. PS between the systems was finally estimated using the first variable of each chaotic system .

=== Figure 2 ===
In order to get a band-wise coherence value, we adopted two different approaches using either the maximum or the mean value of coherence over the evaluated band. The values of PLV and both coherence approaches are shown in Figure 2a. Clearly, both PLV and coherence evolve closely with increasing coupling, with coherence values slightly overestimated for low couplings. This overestimation is due to the lower number of phases used for the coherence calculation as compared to the PLV. Indeed, for the calculation of coherence data was split into 400 samples segments with 200 samples of overlapping, giving a total of 99 segments, and 99 independent phases. For the calculation of PLV, with a bandwidth of 0.030·π radians per sample, there is three effective sample for every 100 samples, giving a total of 600 effective samples, and thus 600 independent phases.
In order to address this problem, Figure 2b shows the results using a smaller window for the coherence calculation. In this case, with 100 samples windows and 50 samples overlap, coherence is calculated using 399 phases. The overestimation for low coupling is clearly lower than in the previous case, but the maximal value of coherence drops to 0.85. In this case, the reduction in the calculated synchronization can be explained from the frequency smoothing associated with a shorter time window. Due to the narrowness of the frequency peak of the Rössler oscillator, a larger amount of frequency smoothing forces the coherence to be determined with a phase that includes both the peak itself and the surrounding frequencies.
The effect is similar when evaluating the mean coherence over a band, as coupled frequencies are averaged together with non-coupled ones. However, PLV does not show this effect, as the instantaneous phase of the signal is calculated over the whole band, and not from an average of frequencies. In this sense, PLV is superior to coherence, as the position and narrowness of the oscillator over the observed band are not relevant. PLV could even handle an oscillator with varying frequency, given that the oscillatory frequency never goes outside of the observed band. On the other hand, maximal coherence would only return the coupling at the most common frequency, ignoring the coupling at other frequencies, and mean coherence would dramatically underestimate the connectivity.

Effects of volume conduction
As commented before, one of the main criticisms to PLV and coherence is their sensitivity to volume conduction. Volume conduction can be modeled as an instantaneous projection of one signal onto the other, generating zero-lag synchronization. As shown in the Methods section, Guido Nolte's proposed imaginary part of coherency [12], [10] can be extended to the PLV, giving a set of zero-lag insensitive measures. In order to evaluate the behavior of these PLV derived metrics, we used the same pair of chaotic systems defined in the previous section. The systems show nonzero-lag phase synchronization, and the volume conduction can be introduced using instantaneous linear mixing [22], [23]: where is a parameter determining the amount of mixing. Figure 3 shows the synchronization estimated between the signals described in the previous section using the PLV, iPLV and ciPLV, and values of of 0, 0.1 and 0.2. Figure 3a shows the estimated synchronization for V=0 (without zero lag mixing), measured with PLV, iPLV and cIPV. Values of PLV and ciPLV are almost identical, indicating that the synchronization of the pair of the chaotic oscillator is nonzero lag. However, iPLV values are slightly lower, especially for high couplings, indicating that the lag between both signals is almost, but not exactly, one-quarter of a cycle, so the PLV complex vector has only a small real component. Figure 3b, corresponding to V=0.1, shows the effect of 10% volume conduction between the signals, introducing low spurious zero-lag synchronization. PLV values increase especially at lower couplings, whereas iPLV and ciPLV values remain unaffected. iPLV shows lower synchronization values that ciPLV, which remains almost identical to the V=0 case. This is because the complex PLV vector now has an important real component that is completely ignored by iPLV. This shows that ciPLV is superior, as it uses this real component to normalize the imaginary one. Figure 3c shows the effect of 20% volume conduction, with similar results, but exacerbating the estimation errors in PLV and iPLV. ciPLV, however, continues to extract the correct synchronization value.

Conclusions
In this paper, we propose a new formulation of the PLV algorithm that allows its fast computation by bypassing CPU-demanding calculations. The direct use of this new algorithm in Matlab allows a speedup by a factor 100 from a vector-optimized implementation. Even though this implementation can be improved, it is unlikely that a similar speedup can be achieved using the original formulation.
It is important to note that the optimization is only based in a mathematical reformulation of the original definition of PLV, in which the phase extraction and the exponentiation necessary for its calculation are replaced by a simple matrix multiplication. Importantly, the algorithm can be implemented in any language, and the code provided in the Appendix, created in Matlab, is completely machine independent. Thus, differently to a toolbox we recently released [24], there is no need to compile the source code in C language, which makes it much easier to use. Note also that this paper only deals with the mathematical definition, and there may be room for computational improvement.
In addition, this reformulation of the original PLV definition stressed the similarities between PLV and coherence. The comparison of both metrics as applied to a pair of chaotic systems with different levels of coupling showed that PLV and coherence values are closely related, with PLV showing a better behavior in the (very realistic) situation where the coupling takes place over a whole (possibly narrow) band instead of at a single frequency. This is, however, not a flaw of coherence, but the direct effect of its definition. PLV seems best to evaluate synchronization over a whole band, whereas coherence seems best to evaluate it at fixed frequencies.
Finally, we evaluated the behavior of PLV in the relevant case of volume conduction/source leakage as simulated by instantaneous linear mixing of the signals corresponding to the two systems. In analogy to coherency, we introduced the imaginary part of PLV and its corrected version. Both algorithms proved insensitive to volume conduction, but iPLV was flawed by the effect of a real component of the complex PLV vector. As in the case of the imaginary part of coherency, iPLV only can achieve the maximal value when the difference of phases of the two signals is exactly π/2 radians and drops to almost zero when the phase difference is small but consistently nonzero. ciPLV corrects this behavior in a similar way to the corrected imaginary part of coherency [10], being both unbiased and insensitive to volume conduction/source leakage effects. This is not the first time that a metric based on the imaginary part of PLV is used [25], but, to our knowledge, the expected behavior had not been thoroughly tested yet.
In conclusion, we have shown here that, despite its longevity, it is still possible to refine further the estimation of the (allegedly) most popular method of PS, the PLV index. The new formulation not only allows for a much faster estimation of the index but also, by highlighting the similarities between PLV and coherence, prompts the definition of two new measures insensitive to zero lag. These measures, derived from the vector product of the real and imaginary part of PLV, are analog to the (corrected) imaginary part of coherence and are therefore robust again volume conduction and source leakage effects. We hope these new developments will encourage further the use PLV-related measures in the analysis of neurophysiological data at the sensor and the source level.

Appendix. Implementations for the PLV calculation tested in the Results section
In the Results section of this paper, we tested three different implementations for the calculation of PLV. In this Appendix, we show the three different codes developed in Matlab to evaluate the behavior of each algorithm. In the codes provided, all three implementations are fed with the Hilbert analytic signal, with shape signals per samples per trials.
The first implementation includes two nested loops, so every pair of signals is evaluated independently. This option is the most memory conservative, as only two signals are evaluated in each interaction, but is the slowest one. The code is: [ nc, ns, nt ] = size( data ); phs = angle( data ); plv = zeros( nc, nc, nt ); for t = 1: nt for c1 = 1: nd for c2 = c1 + 1: nd dphs = phs( c1, :, t ) -phs( c2, :, t ); plv( c1, c2, t ) = abs( mean( exp( 1i * dphs ) ) ); end end end The second implementation is a vectorized and memory efficient version of the algorithm. In this implementation every signal is compared against all the other signals, avoiding two loops. This option uses slightly more memory but achieves a speedup of factor 2.5 over the original one. The code is: [ nc, ns, nt ] = size( data ); phs = angle( data ); plv = zeros( nc, nc, nt ); for t = 1: nt tplv = complex( zeros( nc ) ); for s = 1: ns dphs = bsxfun( @minus, phs( :, s, t ), phs( :, s, t )' ); tplv = tplv + exp( 1i * dphs ); end plv( :, :, t ) = abs( tplv / ns ); end The last implementation uses the proposed algorithm. This algorithm allows the direct calculation of all source pairs at once, with negligible memory increase. In addition to avoiding the angle and exp functions, the implementation removes the need to use a loop over the signals. The code is: [ nd, ns, nt ] = size( data ); ndat = data ./ abs( data ); plv = zeros( nd, nd, nt ); for t = 1: nt plv( :, :, t ) = abs( ndat( :, :, t ) * ndat( :, :, t )' ) / ns; end  The values of the synchronization indices analyzed for different couplings for a pair of Rössler and Lorenz systems. The dark line represents the mean value over 50 executions, and the shadow line represents the standard deviation. A. PLV, maximal and average coherence for the band between 0.570·π and 0.600·π radians per sample. Coherence was calculated using a Hamming window of length 400 with 200 samples of overlapping. B: The same that in A, using a window length of 100 samples with 50 samples of overlapping. Note that only coherencebased metrics are affected by this change.

Figure 3
The values of the synchronization indices analyzed for different couplings for a pair of Rössler and Lorenz systems. The dark line represents the mean value over 50 executions, and the shadow line represents the standard deviation. A: Synchronization estimated using PLV, iPLV and ciPLV. B: The same that in A, after adding a 10% of linear mixing to the signals. C: Idem after adding a 20% of linear mixing.