Improved functional connectivity network estimation for brain networks using multivariate partial coherence

Objective. Graphical networks and network metrics are widely used to understand and characterise brain networks and brain function. These methods can be applied to a range of electrophysiological data including electroencephalography, local field potential and single unit recordings. Functional networks are often constructed using pair-wise correlation between variables. The objective of this study is to demonstrate that functional networks can be more accurately estimated using partial correlation than with pair-wise correlation. Approach. We compared network metrics derived from unconditional and conditional graphical networks, obtained using coherence and multivariate partial coherence (MVPC), respectively. Graphical networks were constructed using coherence and MVPC estimates, and binary and weighted network metrics derived from these: node degree, path length, clustering coefficients and small-world index. Main results. Network metrics were applied to simulated and experimental single unit spike train data. Simulated data used a 10x10 grid of simulated cortical neurons with centre-surround connectivity. Conditional network metrics gave a more accurate representation of the known connectivity: Numbers of excitatory connections had range 3–11, unconditional binary node degree had range 6–80, conditional node degree had range 2–13. Experimental data used multi-electrode array recording with 19 single-units from left and right hippocampal brain areas in a rat model for epilepsy. Conditional network analysis showed similar trends to simulated data, with lower binary node degree and longer binary path lengths compared to unconditional networks. Significance. We conclude that conditional networks, where common dependencies are removed through partial coherence analysis, give a more accurate representation of the interactions in a graphical network model. These results have important implications for graphical network analyses of brain networks and suggest that functional networks should be derived using partial correlation, based on MVPC estimates, as opposed to the common approach of pair-wise correlation.


Introduction
The ability to infer and characterize interactions between individual neurons and groups of neurons is in characterising the interactions within brain networks is becoming well established [6][7][8][9]. Graphical brain networks can be derived from different modalities, these include anatomical, functional and effective connectivity [1]. Functional connectivity, used here, is concerned with characterising functional interactions between nodes, or signals, in a graphical network representation. Functional connections, or interactions, can be quantified using a range of measures. One common approach to determining functional connectivity is through correlation analysis. This can use time domain or frequency domain measures of correlation. In the frequency domain a common measure of functional interaction between two neuronal signals, is provided by the coherence function [10].
Our approach is exclusively frequency domain based on estimates of coherence and partial coherence functions. These provide a normative measure of correlation or partial correlation between neuronal signals on a scale [0, 1] as a function of frequency. They can be applied to time series and spike train data, and have well established methods for estimation and setting confidence limits [10,11]. Ordinary coherence provides a pair-wise measure of functional interaction. In large networks this approach may be confounded by the presence of common influences. Ordinary coherence cannot distinguish direct connections from common inputs. For example neurons A and B which receive common input from neuron C will exhibit a pair-wise coherence between A−B. Multivariate partial coherence (MVPC), analogous to partial correlation in multiple regression, can be used to assess whether an observed correlation is due to a direct connection or due to common influences. In situations where putative common inputs are available as additional signals, an MVPC estimate will distinguish between direct connections and common inputs [10,12]. There are a number of advantages to working in the frequency domain as discussed in [10], the estimation of parameters and setting of confidence limits is more straightforward, and the characterisation of neural data is often according to well-established frequency bands [9]. If required, partial correlation measures in the time domain can be derived using partial spectra [13].
A number of approaches are available for constructing graphical networks that describe functional interactions between multivariate neuronal signals. Coherence and partial coherence, as used here, are based on the concept of correlation and partial correlation [6,8,9,14]. Approaches based on correlation and partial correlation construct graphical networks based on linear interactions, the absence of an edge is based on the null hypothesis that a pair of nodes are uncorrelated when the linear effects of any predictors are taken into account [12].
Information theoretic measures have been used to construct graphical networks using mutual information (MI) [15]. Graphical networks using MI are constructed against a null hypothesis of statistical independence, a more rigorous condition than zero correlation [16]. However, reliable MI estimators are typically more challenging to construct and to test for statistical significance [17]. In the linear Gaussian case MI can be derived from coherence [18].
Graphical networks can also be constructed by identifying functional subgraphs in larger networks, using the concept of motifs [19,20]. Functional subgraphs can be identified using information theoretic measures [20] or coherence and partial coherence [21]. The focus here is on identifying a single graph to describe the functional relationships between all available signals.
Novelty in the present study is the use of network metrics derived from MVPC. When compared against known network structure using simulated data, network metrics derived from MVPC give a more accurate representation than those derived from ordinary coherence. Validation is through application to 100 simulated spike trains from a 10 × 10 grid of simulated cortical neurons with centre-surround connectivity. Application to experimental data uses a 19 channel multi-electrode array (MEA) recording in two subregions from left and right hippocampus in the rat. Section 2 describes the calculation of network metrics from coherence and partial coherence estimates. Section 3.1 considers scalability of the approach and section 3.2 compares network metrics derived from coherence and partial coherence for simulated data with known connectivity. Section 3.3 describes graphical network analysis of the MEA data set using simple and compound network metrics. Section 4 provides discussion and conclusions.

Methods
Our framework uses a two stage approach that combines multivariate spectral analysis with network theoretic measures. The spectral analysis is used to generate the adjacency matrix that describes the interactions between the signals. The adjacency matrix can be either binary or weighted, and is determined from the magnitude of coherence estimates or the magnitude of MVPC estimates. Network measures are derived directly from the adjacency matrix. In this section we describe the calculation of coherence and MVPC estimates used to generate the adjacency matrix, then describe the construction of network measures.

Coherence estimation
Coherence is a widely used measure of association, which has been applied to time series data [22], point-process or spike train data [10,23] and hybrids of the two [11]. Here we are considering neuronal spike trains represented as stochastic point processes.
The modelling of spike trains as point processes is described in [10]. For two point-processes N 1 and N 2 , the coherence at frequency λ, |R 21 (λ)| 2 , is defined as where f 21 (λ) is the cross-spectrum between processes N 1 and N 2 , and f 11 (λ), f 22 (λ) are the individual autospectra. Estimation is achieved by direct substitution of spectral estimates into equation (1). Estimates of the individual spectra can be achieved using a range of approaches. Here we use Welch Overlapping Segment Average (WOSA) spectral estimates, a widely used non-parametric approach, which breaks long records down into segments and spectra are estimated by averaging over segments [24]. We use nonoverlapping segments where a record with R samples is sectioned into L segments of length T, R = LT.
Denoting the discrete Fourier transform of section l from process N 1 as d T N1 (λ, l), with a similar notation for d T N2 (λ, l), the cross-spectrum f 21 (λ) is estimated aŝ where the hat symbol ,ˆ, denotes an estimate and the overbar indicates a complex conjugate. Estimates of auto-spectra,f 11 (λ),f 22 (λ) are obtained by using the appropriate discrete Fourier transforms in equation (2).

Partial coherence estimation
First order partial coherence estimates are used to test the hypothesis that the correlation between two spike trains, N 1 and N 2 can be attributed to the common influence of a third spike train N 3 . In this scenario estimates of the partial coherence, |R 21/3 (λ)| 2 , should show no significant features compared to the estimated ordinary coherence |R 21 (λ)| 2 . Partial coherence with a single predictor can be defined and estimated using manipulation of auto-and cross-spectra between the three spike trains N 1 , N 2 and N 3 . This approach is described in [12]. Higher order partial coherence estimates, using a number of spike trains as predictors, are defined using a matrix formulation. Here we follow the approach of [25] which relies on inversion of the spectral matrix. For a set of r point processes, N 1 , . . . , N r , the spectral matrix f N (λ) is constructed as where the diagonal entries are auto spectra and the off-diagonal entries are the cross-spectra between pairs of point-processes. This spectral matrix has Hermitian symmetry and can be inverted provided the matrix is not ill-conditioned, i.e. that sufficient degrees of freedom have been used in constructing the auto-and cross-spectral estimates used to construct f N (λ) [26]. For the estimate in equation (2) this implies L > r. In practice it is usual to have L ≫ r, if L is only slightly larger than r then additional techniques can be applied to improve the reliability of MVPC estimates [27]. The inverse spectral matrix, f −1 N (λ), provides the components necessary to estimate higher order partial coherence functions. Denoting the inverse spectral matrix as g(λ) = f −1 N (λ), the diagonal entries of this, g ii (λ), can be manipulated to form a diagonal matrix d(λ) as An r × r matrix of partial coherence functions, |R N (λ)| 2 , is defined as [25] |R The partial coherence between point processes i and j with the other (r − 2) processes as predictors, where N (ij) refers to all entries in N apart from i and j, is obtained from entry (i, j) of matrix |R N (λ)| 2 , i ̸ = j. Other approaches are possible to define and calculate partial coherence [12,22] but the matrix inversion method is more efficient for large data sets [26]. A number of potential pitfalls are present in using conditional networks derived from MVPC estimates. One of these is divergent effects, as illustrated by a three node network where nodes A and B connect to node C, and not to each other. A bivariate coherence analysis will show no significant interaction between A − B, in contrast a partial coherence estimate conditioned on node C may be significant depending on the strength of connections from A and B onto C. This confounding effect can be avoided by only considering interactions where there is significant bivariate (ordinary) coherence between pairs of nodes. If there is no bivariate coherence, there is no pairwise correlation for the predictor(s) to explain, in this case no edge is included in the network.

Edge detection using statistical filtering of coherence or partial coherence
The approach in this study is to construct functional graphical networks that are consistent with the pattern of linear dependencies seen between the signals, and use network metrics to quantify the characteristics of these functional networks. Linear dependencies are estimated using coherence and MVPC, which provide normative measures of correlation or partial correlation on a scale [0, 1] as a function of frequency. Statistical filtering of coherence or MVPC estimated between two nodes is required to determine if an edge is present in a graphical network. This is equivalent to determining if a coherence or MVPC estimate is statistically significant at a specific frequency or over a specified range of frequencies using confidence limits. The setting of confidence limits for coherence and MVPC estimates is discussed in [10], using a WOSA estimate with L non-overlapping segments gives a confidence limit at significance level α for a null hypothesis of zero coherence of 1 − α 1/(L−1) . We use α = 0.05 to set 95% confidence levels. For an MVPC estimate constructed from L segments using k predictors, the 95% confidence limit is estimated as the increased value compared to the ordinary coherence takes into account the loss of degrees of freedom in the MVPC estimate compared to the coherence estimate [10]. We refer to networks derived from coherence as unconditional networks, and those derived from MVPC as conditional networks, since MVPC estimates are conditioned on a set of common predictors. Adjacency matrices are determined from coherence estimates for unconditional networks and from MVPC estimates for conditional networks. In both cases, the average coherence or MVPC in a predetermined frequency range is compared against a 95% confidence limit. If the average exceeds the confidence limit then a ij = 1 and w ij is set equal to this average, otherwise a ij = w ij = 0. The quantities a ij and w ij are defined in section 2.4. The frequency ranges used are defined in section 3.1 for simulated data and in section 3.3 for the experimental data.
This approach to determining a ij and w ij draws on the extensive theory for time series analysis and avoids the need for surrogate data. Similar approaches and further discussion are in [6,9,10,26,29].

Network metrics
Network science provides important insight into complex systems that are composed of individual components linked together. Networks can be visualised using a graphical representation and network properties can be quantified through application of network metrics [1,3,8,28]. We use the approach in [1] as the framework within which to estimate network metrics, which describes the use of anatomical, functional and effective networks. Functional networks, used here, are based on the concept of functional interactions between signals, which treats each signal as a node and seeks to establish a set of edges that describe the pattern of correlation or partial correlation between the variables. Our approach determines the number of edges for each node through the significance of coherence, or partial coherence, estimates. Statistical filtering determines if an edge should be included, as described in section 2.3. If coherence or partial coherence estimates are statistically significant then an edge is included in the graph, and the weight of the edge is determined from the mean coherence or mean partial coherence over the specified frequency range.
Network theory metrics start from the adjacency matrix, A, a 2D matrix which describes any edges linking the N nodes. The adjacency matrix can be binary, A b , containing the values 0 or 1, or weighted, A w , where each edge is given a weight. The individual elements are denoted as a ij for A b , and as w ij for A w . Adjacency matrices are symmetrical for undirected networks: a ij = a ji , w ij = w ji . Here a ij represents the presence or absence of an edge between nodes i and j, w ij represents the strength of interaction, 0 ≤ w ij ≤ 1.
Adjacency matrices are determined from coherence estimates for unconditional networks and from MVPC estimates for conditional networks as described in section 2.3. In both cases the average coherence or MVPC in a pre-determined frequency range is compared against a 95% confidence limit. If the average exceeds the confidence limit then a ij = 1 and w ij is set equal to this average, otherwise a ij = w ij = 0. This approach to determining a ij and w ij is considered in [29].
We consider two simple metrics and two compound metrics in this study. The two simple metrics are node degree and path length. The compound metrics are clustering coefficient and small-world-ness. These metrics are estimated for binary and weighted network representations.
Node degree is the most important measure in network theory, all network measures are derived from node degree or node weight. Node degree in binary networks is defined as k i = ∑ j∈N a ij , 0 ≤ k i ≤(N − 1), which counts the number of edges that connect a node to the rest of the network. The weighted degree, or node strength, is the sum of all weighted connections to a node: k w i = ∑ j∈N w ij [1]. The path between two nodes is an ordered sequence of all possible routes through the network connecting two nodes. The shortest path in a binary network, d ij , is calculated as the sum of a ij in the shortest pathway. In a weighted network the shortest path d w ij is calculated as the sum of f in the shortest path, where f(·) maps weight to length, here we use the inverse function [1] applied to the mean coherence or mean MVPC in the pre-determined frequency range. The average d ij between a node and all other nodes in a binary network is denoted as L i , the average of these, L, is the characteristic path length for the network [1,30]. For weighted networks the characteristic path length L w is based on the average d w ij . Characteristic path lengths provide an indication of global integration between nodes. Regular networks typically have longer path lengths compared to random networks, where long range connections decrease the characteristic path length [30].
Clustering coefficients are an example of a compound network metric which measures local network properties, in this case the fraction of nodes connected to node i that are themselves connected. For a binary network the clustering coefficient, C, is defined as the mean C i across all N nodes. C i is the clustering coefficient for node i, , where t i is the number of triangles around node i, determined from a ij . For node degree k i < 2, C i = 0. A weighted clustering coefficient C w i can be calculated from w ij , the weighted network clustering coefficient C w is the mean of these [31].
A small-world network is one that exhibits both functional integration and functional segregation. This is achieved with many local connections and a few long distance connections [30]. In terms of network measures, a small-world network should have a similar characteristic path length and a larger clustering coefficient when compared to a random network. This leads to the definition of a small-world metric for binary, S, and weighted networks, S w , as [30,32] where the subscript 'rand' indicates a measure calculated from a random graph. Small-world networks should have S > 1, S w > 1 [32]. The detection of smallworld network topology is an important aspect of understanding brain function [7], we investigate calculation of small-world metrics from MVPC estimates in section 3. We use Erdös Rényi random networks with the same number of nodes and edges as the network under analysis. These are generated by randomly assigning an edge between a pair of nodes with equal probability, which creates random networks with a uniform node degree distribution [32]. Weighted random networks are constructed by random permutation of weighted edges from the original network [33]. Separate random networks are constructed each time the small world metric, equation (7), is calculated using a single random network. Erdös Rényi networks have been used in calculation of small-world metrics with bivariate and multivariate correlation measures [34].

Results
We consider application of coherence and MVPC based network metrics to simulated and experimental data. Our hypothesis is that network metrics derived from MVPC should give a more accurate representation of network interactions than those derived from ordinary coherence estimates. Section 3.1 demonstrates calculation of the adjacency matrix from MVPC estimates and considers scalability of MVPC estimates. Simulated data allows the approach to be validated by comparison of network metrics with the known network connectivity, section 3.2. Application to experimental data is demonstrated on a MEA data The simulated data is generated using a cortical neuronal network simulation. One hundred neurons are arranged in a 10 × 10, 2D sheet, with 75% excitatory and 25% inhibitory neurons, with location determined randomly. Individual neurons are modelled using a conductance based cortical neuron model [35]. Each neuron is activated by independent populations of excitatory and inhibitory inputs. In addition, each excitatory neuron makes 12 excitatory synaptic connections with its nearest neighbours, each inhibitory neuron makes 12 inhibitory connections with a ring of neurons just outside its nearest neighbours. The synaptic connections do not wrap around at the edges of the network. This centresurround connectivity is designed to promote extensive interactions in the network, many neurons have reciprocal connections, and receive a mixture of excitatory and inhibitory input, making it challenging to estimate network degree. Figure 1 shows a schematic of the 10 × 10 cortical neuron network simulation with excitatory neurons in blue and inhibitory neurons in red. A full description of the network simulation is reported elsewhere [35]. We first use the simulated spike trains to demonstrate calculation of the adjacency matrix from MVPC and consider scalability of MVPC estimates.

Detection of network edges and scalability of partial coherence estimates
This section considers how edges are determined from coherence and partial coherence estimates using confidence limits. It also considers the scalability of partial coherence estimates as the number of predictors is systematically increased.   For neuron pair 44-55, figure 2(a), nine of the 10 predictor neurons are excitatory with connections to neurons 45 and 55, we expect to see a systematic reduction in MVPC magnitude as more neurons which supply common input are included as predictors. The ordinary coherence estimate has a peak value of 0.57, the magnitude of MVPC estimates decreases systematically from a peak value of 0.4 for a single predictor to around 0.033 for the 10th order MVPC estimate. A coherence of 0.57 suggests that 57% of the variability in neuron 55 can be predicted from a knowledge of the firing times of neuron 45 (and vice-versa). Given the independent background synaptic inputs and the extensive synaptic connections present, this is likely to represent an overestimate of the effects of the direct synaptic connections between these two neurons. Inclusion of the 10 immediate neighbour neurons as predictors results in an MVPC estimate which fluctuates around 0.02 (figure 2(a) red line). This may be a more accurate estimate of the effects of the reciprocal synaptic interactions between the two neurons once common influences from neighbouring neurons are removed. For an MVPC estimate with k = 10 predictors and L = 292 segments the upper 95% confidence limit is 0.010 6, equation (6). The average MVPC up to 30 Hz is used to determine the presence of an edge in the network, this represents the frequency range where coherence is strongest. The average coherence up to 30 Hz for this example is 0.36 and the average partial coherence is 0.015 3. Since these both exceed their respective confidence limits an edge is included in the adjacency matrix between neurons 44 and 55 in unconditional and conditional networks. Figure 2(b) considers a more distant pair of neurons, 45-95. There is no direct connection between these two neurons in the simulated network, thus we would not expect an edge to be present between this pairing. The coherence estimate (black line) has a peak value of 0.051 and the average up to 30 Hz is 0.013 1, which is above the 95% confidence limit (dashed line) so an edge is included in the unconditional network. All partial coherence estimates show a clear reduction in magnitude, the average up to 30 Hz for the estimate with ten predictors is 0.003, thus no edge is included in the conditional network.
The examples in figure 2 demonstrate scalability of MVPC estimates with up to k = 10 predictors. In section 3.2 MVPC estimates with k = 98 predictors are used to establish the presence of edges in the network using the same approach.
As an alternative to the frequency domain approach, figure 3 illustrates time domain correlation and partial correlation estimated using covariance and partial covariance functions calculated from cross spectra and partial cross spectra [13].
The time domain analysis in figure 3 leads to similar conclusions regarding the presence of unconditional and conditional links between the two pairs of nodes in the graphical network. In particular there is no evidence for a conditional link between nodes 45-95, figure 3(d), in contrast to the conditional estimate for nodes 45-55 ( figure 3(c)). This is as expected from the known network connectivity. One advantage of a frequency domain approach, based on coherence and partial coherence, is that weighted estimates are more straightforward to derive from these normative measures compared to the unbounded time domain measures in figure 3. Normalisation of time domain measures is discussed in [13].

Unconditional and conditional network metrics -simulated cortical neuron network data
In this section we compare estimated node degree and path length for unconditional and conditional networks against our expectations based on the connectivity in the simulated cortical neuron network. Figure 4 shows heat maps of binary node-degree, a ij , derived from coherence and MVPC estimates. Node degree was estimated using the average coherence or average MVPC at frequencies up to 30 Hz as explained in section 3.1. Unconditional node degree estimates range from 6 to 80, conditional node degree from 2 to 13, determined from estimates with L = 292 segments, with segment length T = 1024 and a sampling interval of 1 ms.

Simulated data -Node degree
The mean(SD) of a ij across the 100 nodes are 48. 8(22.3) for the unconditional network and 7.7(2.8) for the conditional network. This reduction is a consequence of the reduction in magnitude of MVPC compared to ordinary coherence estimates, figure 2, where common influences across neuronal firing are removed prior to estimating conditional node degree.
Inhibitory connections in the simulated network use shunting inhibition to counteract simultaneous excitatory inputs [35]. Therefore, to obtain a benchmark for assessing the accuracy of node degree estimates we consider the number of excitatory connections to each neuron as a target, a ij(T) . These a ij(T) depend on the proximity of each neuron to the network edge and the location of inhibitory neurons, and vary from 3 to 12 across the 100 neurons. To quantify the improved performance of conditional (multivariate) estimates of binary node degree compared to unconditional (bivariate) estimates we use the absolute difference between the target and estimated node degree, |a ij(T) − a ij(E) |. Table 1 shows the range,

Simulated data -Path length
Path length is a useful network metric for quantifying pathways between nodes. Here we consider shortest path length between two nodes, d ij . We can exploit the symmetry in functional networks where d ij = d ji to combine shortest path lengths in a single figure using i < j for unconditional path length and i > j for conditional path length. This combined representation is shown in figure 5 for all connections in the 10 × 10 network.
Considering binary path lengths allows comparison with the known path length in the centre surround network, where excitatory connections extend 1 node in the diagonal directions and 2 nodes in the horizontal and vertical directions [35]. Thus  we would expect to see a maximum path length of 5 in the horizontal/vertical direction within a single row/column, traversing the 10 nodes in steps of two, and a maximum path length of 9 in the diagonal direction, traversing from node to node in single steps. The range of estimated shortest path lengths for the unconditional network is [0, 4]. The majority of connections have d ij of 1 (49.3%), 2 (42%) or 3 (8.4%). This is too small to account accurately for the centre-surround connection structure in the network. The conditional network has a range of [0,8] for the estimated d ij . Table 2 shows the range, mean and SD of the absolute difference between the target and estimated shortest path length, |d ij(T) − d ij(E) |, for unconditional and conditional networks between the 75 excitatory neurons in the network. A two-sided Wilcoxon rank sum test suggests the reduction in this error for the multivariate estimates of path length compared to the bivariate estimates is highly significant: p < 0.001.
In summary, node degree and binary path length estimated for conditional networks using MVPC agree more closely with our expectations based on the known connectivity in the simulated cortical neuron network. This is in contrast to the unconditional network metrics estimated from ordinary coherence which overestimate degree and underestimate path length using the same data.

Unconditional and conditional network metrics -experimental multielectrode array data
In this section we apply the unconditional and conditional network metrics to single unit experimental MEA recordings from a study investigating intra-and inter-hippocampal connectivity in a model of kainic acid (KA) induced mesial temporal lobe epilepsy (mTLE) in rat [36]. These experiments explore the changes in neuronal firing and brain connectivity during seizure. Our hypothesis for the experimental data is that the differences seen between unconditional and conditional network metrics for the simulated data should also be present in experimental data. Based on the results from simulated data, we expect conditional metrics to give a more accurate view of functional interactions in the experimental data compared to unconditional metrics. Application to the experimental data set investigates whether conditional and unconditional metrics are sensitive to alternations in brain state and whether network metrics can discriminate between epileptic and baseline brain states.
Multiple single-units were recorded simultaneously using a Plexon Multichannel Acquisition Processor system, neural activity was processed using 30 60 90  120  150  180  210   1  2  3  4  5  6  7  8  9  10  11  12  13  14  15  16  17  18  19 Channel MEA data -Unconditional node degree Offline-Sorter and Neuroexplorer to extract single unit spike train activity in the CA1 and CA3 regions of left and right hippocampus [36,37]. All procedures had ethical approval and were carried out in accordance with the Animals (Scientific Procedures) Act 1986, UK. The recording protocol consisted of (1) a 10 min baseline recording; (2) local saline injection after 10 min; and 3) local injection of KA after 30 min (1 mM, 1µL). Local injections were to left hippocampus. The duration of the recording is 215 min. A total of 19 single units were recorded, with spike times defined using a sampling interval of 1 ms. Channel locations are 1-5: CA3 Left, 6-10: CA1 Left, 11-14: CA3 Right, 15-19: CA1 Right. For network analysis the 215 min was split into 5 min blocks with spectra, equation (2), coherence, equation (1), and MVPC, equation (5), estimated for each 5 min block using T = 1024, L = 292 and k = 17 predictors for MVPC estimates. Network metrics are calculated from these coherence and MVPC estimates, using the procedure described in section 2.4. For this data coherence and partial coherence estimates are significant up to 70 Hz, network metrics are based on the average over the frequency range 0-70 Hz. The 95% confidence limits for coherence with L = 292 is 0.010 2 and for MVPC with L = 292 and k = 17 is 0.010 9. Figure 6 shows unconditional and conditional a ij for the 19 channels, using 43 blocks of 5 min. shows the mean and SD of a ij estimates over different time periods in min: 0-215 (complete record), 0-30 (baseline), 30-60 (after KA injection), and 90-120 min (60 min after KA injection). Mean and SD in table 3 are averages across all 19 channels using values from the relevant 5 min blocks. Like the simulated data, conditional node degrees are lower for the MEA data. The maximum node degree for the unconditional network is 13, that for the conditional network is 10. Table 3 shows that node degree is, on average, 2-3 times smaller for the conditional network compared to the unconditional network. Two-sided Wilcoxon rank sum tests suggest this reduction in node degree is highly significant, with p < 0.001 for all 4 time periods. Considering the node degree estimates in the individual five min time blocks, two sided Wilcoxon rank sum tests on individual time blocks show a significant difference in 42 of the 43 time blocks (98%), with 42% having p < 0.05, 47% having p < 0.01, and 9% having p < 0.001. Thus, for this data, there are significant reductions in conditional estimates of node degree compared to unconditional estimates across the entire recording. There is a higher level of variation in unconditional degree, both across channels (vertical) and over time (horizontal). Table 3 shows that the SD of conditional network node degree is around half that seen for the unconditional network.

Experimental data -Node degree
Mean and SD of node degree before (0-30 min) and immediately after KA injection (30-60 min) show opposite trends, increasing for the unconditional network and decreasing for the conditional network (table 3). The reason for this is unclear, however it might reflect an increase in collective activity in the network during the epileptic state at the expense of individual interactions, a scenario that would create increased coherence between units and decreased partial coherence when effects of this common collective activity are removed.
One node that has a consistently higher degree in the conditional network is channel 13, a CA3 Right unit, which has a maximum degree of 9, and is connected to single units in all 4 recording areas. This node could be considered as a network hub. Channel 13 has no specific features that mark it out as unusual in terms of spike train statistics. The mean firing rate across the 19 units is 3.7 spikes/sec, range [0.07, 22] spikes/sec. Channel 13 has the second highest firing rate, 12.7 spikes/sec. Channel 5 has the highest rate 22 spikes/sec and channel 10 has a similar firing rate of 11.9 spikes/s, neither of these are identified as a hub node. The point-process spectral estimate [11] of channel 13 does not differ from the spectra of other channels (data not shown). The identification of this hub node appears to reflect its different connectivity rather than firing properties. This functional role for channel 13 is hidden in the unconditional network where most nodes have connections to all subregions.

Experimental data -Path length
Box plots of weighted, d w ij , and binary, d ij , path lengths in unconditional and conditional networks for each 5 min block up to 120 min are shown in figure 7. We consider behaviour over 120 min as this includes a 90 min period after KA injection to study how network measures respond to the transition from baseline to a pathological state. Characteristic path lengths, L w and L are shown as black dotted lines in figure 7. The overall averages for L w are 58 and 97 for the unconditional and conditional networks, respectively, an increase of 60% for conditional compared to unconditional L w . For the binary networks the overall averages for L are 1.68 and 2.2 for the unconditional and conditional networks, respectively, an increase of 31% for conditional compared to unconditional L. Considering the binary path length estimates in the individual five min time blocks, two sided Wilcoxon rank sum tests on individual time blocks show a significant difference in 35 of the 43 time blocks (81%), with 9% of time blocks having p < 0.05, 9% having p < 0.01, and 63% having p < 0.001. Thus, for this data, there are significant increases in conditional estimates of binary path length compared to unconditional estimates.

Experimental data -Clustering coefficient
Box plots of binary, C, and weighted, C w , clustering coefficients for each 5 min block are shown in figure 8 for conditional and unconditional networks. These measure local connectedness of the network, by estimating the fraction of triangles around each node which have three edges in the binary case [30] or using an intensity based measure in the weighted case [31]. For unconditional binary networks the range across time blocks for the median of C is [0.5, 0.82], the range for the median of C w is [0.019, 0.042]. Conditional networks show reduced levels of clustering compared to unconditional networks. Binary and weighted networks both have 17 time blocks with a median of zero, indicating more than 50% of nodes have no clustering.
Considering binary clustering coefficient estimates in the individual five min time blocks, two sided Wilcoxon rank sum tests on individual time blocks show a significant difference in 17 of the 43 time blocks (40%), with 28% of time blocks having p < 0.05 and 12% having p < 0.01. In all cases the mean conditional clustering coefficient in each time block is smaller than the mean unconditional clustering coefficient.

Experimental data -Small-world-ness
An estimate of network small-world-ness can be constructed using path length and clustering coefficient values according to equation (7). The characteristics of a small-world network are higher C or C w and similar L or L w when compared to random networks. These reflect an optimum connection strategy where segregation and integration properties coexist across channels [30]. We use the weighted definition comparing C w and L w derived from coherence and MVPC for the MEA data against random networks. A network is said to be a small-world if C w ≫ C w rand , L w ≳ L w rand and S w > 1 [32]. Figure 9 shows plots of the two normalized network measures C w /C w rand and L w /L w rand along with S w for each 5 min block for unconditional (upper) and conditional (lower) networks. The path length ratio, L w /L w rand , has a mean of 1.85 for the unconditional network and a mean of 1.1 for the conditional network, a decrease of 40%. The mean value for the conditional network is closer to the expected of 1 for a small-world network. The clustering coefficient ratio, C w /C w rand , has a mean of 2.0 for the unconditional network and a mean of 4.8 for the conditional network, an increase of 140%. The larger values for the conditional network are more indicative of small-world behaviour, C w /C w rand ≫ 1. The small-world index in figure 9 has a mean of 1.3 for the unconditional network and a mean of 5.0  for the conditional network. A borderline condition for S w is proposed in [32], 1 ≤ S w ≤ 3. For the unconditional network, figure 9 (upper), 10 time blocks have S w < 1, 13 time block are marginal, 1 ≤ S w ≤ 3, and 1 time block has S w > 3 (115 min). The conditional network has 1 time block with S w < 1, 11 time blocks with 1 ≤ S w ≤ 3 and 12 time blocks with S w > 3. Eleven of the time blocks where S w > 3 in the conditional network are after KA injection at 30 min, suggesting an increased tendency for small-world organisation in the pathological condition. A functional magnetic resonance imaging study of mTLE in humans [38] found a tendency for increased C/C rand , decreased L/L rand and increased small-world index S in patients compared to control subjects, and suggested altered small-world properties in mTLE as a potential bio-marker. The present results suggest that it is important to consider conditional networks when estimating small-world properties.   4 shows results from pair-wise two sided Wilcoxon rank sum tests comparing the non-epileptic and epileptic states. Node degree, k i , and path length, d ij , both exhibit a significant change in unconditional metrics in contrast to conditional metrics, where differences are not significant. Binary clustering coefficients, C i , exhibit significant changes in unconditional and conditional metrics between the two states. However, the changes have different signs, with clustering increasing in the unconditional network and decreasing in the conditional network during the epileptic brain state. Interestingly, although differences in conditional network binary node degree and conditional path length are not significant between states, they have the opposite sign to the significant differences in the unconditional networks. Thus, conditional network binary node degree shows a decrease in the epileptic brain state in contrast to the significant increase seen in the unconditional network.

Distinguishing epileptic and non-epileptic brain states using network metrics
The results in table 4 suggest that network metrics are sensitive to differences between non-epileptic and epileptic states. Unconditional networks are more densely connected and have shorter path lengths and increased clustering in the epileptic state. In contrast, conditional networks show the opposite trend, with significantly smaller clustering in the epileptic state.

Differences between conditional and unconditional networks
This paper compares network metrics derived from unconditional and conditional graphical networks using coherence and MVPC estimates, respectively. We consider two elementary network metrics: node degree and path length and two compound metrics: clustering coefficient and small-world-ness. Simulated data from a 100 neuron 10 × 10 grid of cortical neurons allows comparison of metrics with the known connectivity. Absolute errors between estimated and target node degree, table 1, and path length, table 2, were an order of magnitude smaller for the conditional network compared to the unconditional bivariate network. These differences in binary node degree error and path length error estimated from multivariate compared to bivariate coherence were both highly significant, p < 0.001. Thus, for this simulated data set, MVPC based network measures are significantly more accurate compared to the known network connectivity than their bivariate counterparts.
The differences in multivariate and bivariate network metrics were also present when applied to experimental MEA recordings in left and right hippocampus in anaesthetised rat: Node degree was lower by a factor of 2-3 for conditional networks compared to unconditional networks, table 3. Conditional networks had longer path lengths compared to the unconditional network with binary and weighted characteristic path lengths 31% and 60% higher, respectively, figure 7. The differences in binary node degree were significant in 42 of the 43 five min time blocks (98%), the differences in binary path length were significant in 35 of the 43 time blocks (81%). The presence of significant differences in MVPC based measures compared with coherence based measures, in tandem with similar trends and the improved accuracy seen in simulated data with known connectivity, suggests that conditional network measures may provide a more accurate description compared to unconditional networks.
Systematic differences were also seen in compound metrics for the MEA data. Conditional networks had significantly smaller binary clustering coefficients in 17 time blocks (40%), and an increased tendency to observe small-world-ness, figure 9, compared to unconditional networks. Unconditional and conditional metrics were sensitive to differences in the non-epileptic and epileptic network states, section 3.3.5. However, unconditional and conditional networks altered in different ways during KA induced epilepsy, see table 4 for summary. Differences between unconditional and conditional networks for MEA data need further investigation to determine if they can provide reliable bio-markers for different brain states.

Reliable MVPC estimation
The ability to accurately estimate conditional network metrics relies on MVPC estimates. Scalability was considered in section 3.1. An important consideration is that records are sufficiently long to allow reliable estimates to be constructed, where the degrees of freedom exceed the number of available channels. Additional considerations regarding stability of matrix inversion have been investigated in [26] where diagonal up-weighting was incorporated as a means of stabilising the matrix inversion to generate g(λ). We have not incorporated diagonal up-weighting in our matrix inversion process, in part because we are working with point-process spectra which have constant asymptotic power at all frequencies [10]. Our approach to detecting the presence of an edge is to compare the integrated coherence or partial coherence over a predefined frequency range against the 95% confidence interval for each estimate. This requires the frequency range of interest to be defined in advance of calculating network metrics. An alternative approach is to define a frequency band of interest according to accepted classifications (e.g. alpha, beta, gamma) [9]. For situations involving multiple experiments an approach using partial coherence in combination with a false discovery rate across individual graphs is described in [9]. The issue of reliable MVPC estimation and edge detection is an area of ongoing research.

Reliable estimation of small-world characteristics
We have used two simple network measures and two compound measures to illustrate differences between conditional and unconditional networks. One compound measure is a small-world metric. Our main finding is that there are clear differences in the estimation of this small-world coefficient, equation (7), depending on whether unconditional or conditional functional networks are used, see figure 9. A number of additional factors should be taken into account before labelling the MEA data as a small-world network. The construction of small-world metrics using correlation and partial correlation is discussed in [14], who found that when using random networks correlation based measures showed a tendency for increased values of clustering coefficient. In contrast, partial correlation based measure showed a tendency for reduced values of clustering coefficient. Since the normalised clustering coefficient forms the numerator of the small-world measure in equation (7), these tendencies may well impact on this measure. Interestingly, our analysis showed the opposite effect, where the conditional networks showed increased clustering compared to the unconditional networks for the same data. A possible reason for this opposite effect is the use of experimental data with non-random interactions. A number of other factors that could bias smallworld estimates are discussed in [39].
A related issue is the motivation for use of graphical network representations and network metrics. In the case of MEA data the number of channels available should be sufficient to get reliable indicators, particularly for compound metrics. MEA data sets should be screened to establish that patterns of correlation and partial correlation are suitable to represent as a graphical network prior to calculation of network metrics.

Latent variables and size of conditioning set in conditional networks
One concern in the construction of conditional graphical networks is the presence of latent variable effects which may lead to spurious correlations [40]. A partial coherence approach to constructing conditional networks can remove linear effects of variables that are included in the data set. The approach we adopt has the flexibility to incorporate both time series and spike train data, thus for single unit data a simultaneously sampled local field potential could be incorporated as a predictor if required. Alternative approaches to latent variables are discussed in [40].
Our approach to calculating conditional networks is to use MVPC estimates, where for r processes all r(r − 1)/2 partial coherence estimates are generated in a single calculation at each frequency with a single matrix inversion operation using the approach in [25]. This approach assumes that the conditioning set for each pair of processes is the remaining processes. If it is required to construct a conditional network where MVPC estimates are conditioned on specific sub-sets of predictors, then the approach in [10,22] may be preferable which defines specific groupings of inputs (predictors) and outputs in a multivariate linear model. Although more computationally intensive, this approach gives more flexibility in altering conditioning sets in construction of graphical networks. Apriori knowledge of appropriate groupings for conditioning variables could be used to limit the search space. In our case, all spike-trains are treated on an equal basis, so MVPC estimates are conditioned on all remaining processes for simulated and MEA data.

Concluding remarks
Network analysis is an important tool for quantitative analysis of complex systems including brain networks [3]. We have demonstrated that more reliable measures can be obtained from conditional networks using MVPC than from unconditional networks based on pair-wise correlation. To our knowledge this is the first study to undertake such a comparison using realistic numbers of channels, MVPC estimates for simulated data had k = 98 predictors, for the MEA data k = 17 predictors were available. Future work could explore the broader applicability of this approach using a wider range of simulations and data modalities, for example EEG, MEG and fMRI.