High-resolution directed human connectomes and the Consensus Connectome Dynamics

Here we show a method of directing the edges of the connectomes, prepared from HARDI datasets from the human brain. Before the present work, no high-definition directed braingraphs were published, because the tractography methods in use are not capable of assigning directions to the neural tracts discovered. Previous work on the functional connectomes applied low-resolution functional MRI-detected statistical causality for the assignment of directions of connectomes of typically several dozens of vertices. Our method is based on the phenomenon of the “Consensus Connectome Dynamics”, described earlier by our research group. In this contribution, we apply the method to the 423 braingraphs, each with 1015 vertices, computed from the public release of the Human Connectome Project, and we also made the directed connectomes publicly available at the site http://braingraph.org. We also show the robustness of our edge directing method in four independently chosen connectome datasets: we have found that 86% of the edges, which were present in all four datasets, get the same directions in all datasets; therefore the direction method is robust. While our new edge-directing method still needs more empirical validation, we think that our present contribution opens up new possibilities in the analysis of the high-definition human connectome.


Introduction
High-angular resolution diffusion imaging (HARDI) datasets are widely used today for the construction of connectomes or braingraphs. The nodes (or vertices) of these graphs correspond to anatomically identified [1,2], small (1-1.5 cm 2 ) areas of the gray matter, and two nodes are connected by an undirected edge if neural fiber tracts are discovered, connecting the gray matter areas, corresponding to the two nodes [3,4].
The considerable advantage of the graph approach to the analysis of the human brain is that the graphs retain the relevant anatomical connection information from the MRI data, but they do not contain the-mostly irrelevant-data on the exact trajectories of the neural tracts in the white matter. One of the most frequently used and largest diffusion MRI datasets are the public releases of the Human Connectome Project (HCP) [5]. Numerous publications are applying HCP data in different contexts, e.g., [6][7][8][9][10][11][12].

Directing the connectome: The problem statement
High-angular resolution diffusion imaging methods cannot assign directions to the fiber tracts, and, consequently, to the edges of the connectomes mapped. Tractography methods can discover the trajectories, that is, the geometric curves of the fiber tracts in the white matter, but-presently-cannot determine the correct one from the two possible directions of any given fiber tract. Consequently, the connectomes of braingraphs that are prepared from the diffusion MR imaging data are undirected graphs.
There are enormous differences between the directed and undirected graphs. In undirected graphs, if we consider some walks, or paths, or signal propagations on the edges, all of these are allowed in both directions in all the edges. In directed graphs the edges carry unique directions, and the walks or signal propagations are allowed only in this assigned direction of the edges. Obviously, the difference is clear for everyone who has driven a car in the jungle of the one-way streets of a modern city.
Mathematically, there are also very significant differences: many graph-tasks are easy (or just trivial) for undirected graphs and hard for directed ones [13,14]. As an easy example, let us consider an undirected, connected simple graph on n vertices with n − 1 + ℓ edges, and suppose we want to determine the minimum number of edges that needs to be deleted in order to make the graph cycle-free. Without even looking at the graph, we know that ℓ edges need to be deleted, since any connected graphs with more than n − 1 edges on n vertices contain a cycle [13,14]. However, if we consider directed graphs and we want to get rid of all the directed cycles with the deletion of the minimum number edges, the problem becomes hard (more exactly, NP-hard, this is the famous "minimum feedback arc" problem [15]). As another example, the conditions for convergence of random walks on undirected graphs are well-understood [16][17][18], while they have numerous open problems on directed graphs [19].
Individual axons clearly have a well-defined direction of signal propagation from the soma to the end of the axon. Therefore, the directionality of the cerebral connections is a very relevant field of study.

Directing the connectome: fMRI-based methods
Functional MRI, on the other hand, is capable of mapping the temporal sequences of the activity of larger brain areas. Therefore, it seems to be possible to assign directions to these functional or effective connections between those larger areas of the gray matter. Several authors have attempted assigning directions to the edges of the connectomes using temporal sequences of activity changes, mostly on resting state fMRI or electro-physiology data, by applying a broad spectrum of statistical causality detection methods and models (e.g., [20][21][22][23][24][25][26][27][28][29][30][31]). The general approach of the articles listed can be described as follows: the temporal sequence of cerebral activity changes are translated to causality relations by statistical methods, and these causality relations are used to direct the connections between brain regions. Some works criticize the feasibility of directing the functional connectomes through temporal hemodynamic sequences. The article [32] questions the inference of edge direction based on temporal precedence in these studies since the hemodynamics varies between brain regions.
Since both the fMRI data and the non-invasive electro-physiological measurements have poor spatial resolution, usually the goal of these methods is not directing the edges of the HARDI-based, high-definition anatomical connectomes, but rather the functional connectomes on several dozens of vertices, corresponding to large areas of the gray matter, where haemodynamic activity changes were observed. By the best of our knowledge, no significant attempts were made to direct the edges of structural connectomes (i.e., connectomes measured by HARDI) with hundreds of vertices by these methods. One possible reason is due to the poor resolution of the fMRI: it is impossible to decide that if a causality is very probable between two cortical areas A and B in the direction from A to B, then which of the hundreds of the edges, running between A and B should be directed from A to B: theoretically, even one such connection may trigger the observed haemodynamic activity in B.

Consensus connectomes
It is unusual in graph theory that hundreds of different graphs on the very same set of vertices are encountered. When we are working on braingraphs constructed from HARDI data, this is the usual scenario: the cerebral areas of the different human subjects can be corresponded to the same reference brain map [1,2], and, consequently, the graph edges in the connectome can be compared between subjects, since the vertices or nodes are named in each subject according to the very same anatomical brain map. This process is, in fact, a non-trivial refinement of the assignment of anatomical nomenclature to cerebral areas in printed brain anatomy atlases that appeared in the last several hundred years (e.g., [33][34][35]).
Consensus Connectome Dynamics (CCD) is a surprising phenomenon that was observed after our construction of the Budapest Reference Connectome Server [36][37][38]. For the description of the CCD phenomenon, we need to clarify some functions of the Budapest Reference Connectome Server.
The webserver at the address http://connectome.pitgroup.org is capable of finding and visualizing connectome edges that are present in more than a pre-defined number of connectomes. More exactly, let n denote the number of all connectomes processed by the server (n = 6 in version 1.0, n = 96 in version 2.0 and n = 418 or n = 476 or n = 477-depending on other settingsin version 3.0). For any k: 0 � k � n, we say that the frequency of an edge is k if the edge is present in exactly k connectomes out of the n ones [39]. Clearly, if we have n connectomes then the frequency of any edge is between 0 and n: it is 0 if the edge does not appear in any connectome at all; and it is n if it appears in all the n connectomes. The k-consensus connectome contains all the edges with a frequency greater than or equal to k: that is, each edge in the k-consensus connectome is present in at least k individual connectomes.
The Budapest Reference Connectome Server is capable of generating k-consensus connectomes in CSV and GraphML formats for further processing and independent visualization, and on the website, users can also quickly view the resulting consensus connectomes. We note that subsequent to our work on Budapest Reference Connectome Server in 2015 and 2017 [36][37][38], the article [40] also covers consensus connectomes with applications in connectome error correction.

Consensus Connectome Dynamics
After the publication of the article describing the Budapest Reference Connectome Server [36], a surprising property of the server was recognized [38,41]: when we generate k-consensus connectomes for k = n, next for k = n − 1, k = n − 2, . . ., and last for k = 1, then, naturally, more and more edges appear in the graphs (since the inclusion condition is weakened in every step). The surprising phenomenon is that the new edges do not appear randomly in the graph, but new edges are mostly connected to the already existing edges, forming a growing, tree-like structure as it is shown on the video https://youtu.be/yxlyudPaVUE, [38,41]. This "dynamical" observation is quantified on Fig 2 of [38], and it is statically visualized on a very large component-tree at the address http://pitgroup.org/static/graphmlviewer/index.html?src=connectome_ dynamics_component_tree.graphml (for the detailed-and not entirely obvious-description of the component-tree we refer to [38]).
In [38] we called this phenomenon "Consensus Connectome Dynamics", and hypothesized that the particular order of the appearance of the graph edges describes the order of growth of the axonal fibers of these connections in the individual brain development.
If the newly formed edges-i.e., fibers of axons-almost always are connected to the already developed edges-i.e., fibers of axons-then this hypothesis satisfyingly explains the Consensus Connectome Dynamics phenomenon, visualized on Fig 1 or at the video https://youtu.be/ yxlyudPaVUE: • Connections, represented by the graph edges that are present in most connectomes were grown first (e.g., edges in the n-or n − 1-consensus connectomes).
• Next, the newer and newer edges were grown to be connected to the vertices of the older edges of the growing, tree-like structure, forming the k-consensus connectomes, for decreasing k values.
The newer and newer edges have gradually larger deviation, since small differences in the edge frequencies are added up in the growing structure: Clearly, if an edge e is not present in, say, 20% of the graphs, then for another edge, say f, which connects to e, will hold that the e, f edge-pair cannot be present in at least 20% of the graphs, and most probably, the e, f pair will be missing from much more than 20% of all graphs, since f will be missing in some graphs, containing e. Consequently, the frequencies of the newer edges will be less than the frequencies of older edges. Therefore, we believe that the visualization at https://youtu.be/yxlyudPaVUE, describes not only the edges with gradually smaller frequencies, but also the order of growth of the connections in the human brain.

Results and discussion
In the present work, we are assigning directions to some of the edges of braingraphs, depending on the appearance of that edge in the CCD.
The edge-directing method was first described in [38]; here we clarify the method, and make the 423 directed-edge connectomes publicly available at the site http://braingraph.org.
We have two requirements on the edge-directing method, based on Consensus Connectome Dynamics: • Feasibility: While our direct knowledge is very limited on the axonal development of the human brain in general and the directions of fiber tracts in the connectome in particular ( [42,43]), we need to build onto the available biological observations in developing our method; • Robustness: The directions, assigned to the edges of the connectomes need to be as independent as possible from the selected, specific sets of connectomes in the connectome sets in CCD trials. In other words, the CCD phenomenon needs the data of dozens or hundreds of connectomes; based on these connectomes and the CCD phenomenon, we assign directions to some of the edges of the individual connectomes. However, we want these directions assigned to the edges of the individual braingraphs as independent as possible from the particular choice of the several dozen or several hundred connectomes included in the CCD probe.
Here we suggest an (a) Feasible and (b) Robust edge-directing method.

Feasibility
A large part of the axonal connections in the brain of mammals is developed in the embryonic state or the first several weeks after birth [43]. Microscopic studies of developing rat brains show that a considerable number of axons of the cortex (from layer V cortical neurons), in the early brain development, grow in the direction of subcortical regions [42], strengthening our CCD observation that neurons of the cortex are connected later to the subcortical structures than the connections develop between those subcortical structures. Another important observation that forms a base of our method of directing the axonal fibers is the retrograde signaling of post-synaptic activation that stabilizes the newly formed synapses, and, consequently, prevents the retraction of the newly formed axon branches: work by numerous groups (e.g., [44][45][46][47][48][49][50]) witness this phenomenon.
Based on these observations, we make the following hypothesis: Neurons that are not connected to any other neuron yet, grow axons to reach other neurons.
(i). If the branch of the axon reaches a neuron that is not connected to other neurons yet, it retracts, since the post-synaptic activity of the un-networked neuron does not stabilize the connection (i.e., the synapse).
(ii). If the branch of the axon reaches an already "networked" neuron, that is, a neuron with connections to other neurons, then the post-synaptic activity of the "networked" neuron stabilizes the synapse and, consequently, the new connection.
Since the neuronal-level human braingraph with 80 billion neurons as vertices is unavailable today, we re-phrase the hypothesis above for the ROIs that form the 1015 vertices of our braingraphs, as follows: ROIs that are not connected to any other ROI yet, grow axonal fibers to reach other ROIs as observed in the CCD phenomenon.
(iii). If the fiber reaches an ROI that is not connected to other ROIs yet, the fiber retracts, since the post-synaptic activities of the un-networked neurons do not stabilize the neuronal connections.
(iv). If the fiber reaches an already "networked" ROI, that is, an ROI with connections to other ROIs, then the post-synaptic activities of the "networked" neurons stabilize the synapses, and the new connection will survive.
We stress that-instead of the mainly unknown order or temporal sequence of axonal development on the system level-we consider the phenomenon of the Consensus Connectome Dynamics, and we direct the edges of the connectomes according to the hypotheses (iii) and (iv) above.

Robustness
The robustness of the edge direction method is built into the edge direction algorithm, as it is detailed in the "Methods" section.

Methods
The braingraphs in this study were constructed from the data of the Human Connectome Project [5], using the workflow described in [36]. The undirected braingraphs have 1015 vertices each; the whole set of 423 braingraphs can be downloaded from the site http://braingraph. org/download-pit-group-connectomes/.
For verifying the robustness of the edge direction method, we first divided the braingraphs of the 423 subjects into four groups, the first contained 105, the others 106 graphs each. The graphs of each four groups were used-separately-to reproduce the CCD phenomenon, and based on the order of the appearance of the edges in the four different CCD probes, we assigned directions to some of the edges. Had the resulting directed graphs been radically different for the four groups, it would have indicated that our method is not a robust method of directing the edges of braingraphs.
As we detail below, it turned out that the four directed braingraphs were very similar to each other. After obtaining these four directed graphs, we used a majority-like approach to direct the edges of the consensus brain graph for the whole population, and, in a next step, the edges of the individual braingraphs.

Computing consensus braingraphs
For each group, for i = 1, 2, 3, 4, we defined the consensus braingraph G i as the multi-union of the individual connectomes. That is, for each possible graph edge we counted the subjects whose connectomes contained the specific edge. This number ranges from 0 to N i , where N i = 106, is the number of subjects in the given group. We defined the consensus braingraph as a simple graph containing all edges that were present in at least one of the subjects, where the edges are labeled (weighted) with the number of individual connectomes containing them.

Directing the edges with BFS
We defined edge directions on this brain graph G i as follows. For each k from N i down to 1, we defined G ik as the subgraph of G i containing all edges with frequency at least k. This means all the edges which were present in at least k subjects within the ith group. Suppose that G ik is already processed, and we want to process the graph G i(k−1) , which is an augmentation of G ik with precisely those edges which are present in exactly k − 1 individual graphs. Since G ik is already processed, our task is now to direct these newly added edges only. To achieve this, we launch a BFS (breadth-first search) in G i(k−1) , with G ik as the source. This means that, for each node in G i(k−1) , we calculate the distance of this node from G ik within the graph G i(k−1) . This distance is at least 0, is precisely 0 for the nodes of G ik , is finite for the nodes connected to G ik with a path of edges, and is infinite for the nodes of G i(k−1) not connected to G ik . Nodes with infinite distance do exist because sometimes isolated edges or new, small graph components appear in G i(k−1) .
After calculating the node distances, each edge in G i(k−1) will connect either two nodes of the same distance or two nodes with a distance differing by exactly 1. If an edge connects u with v, and the distance of v is exactly 1 less than the distance of u, then we direct that edge from u to v. If an edge connects two equidistant nodes, then we leave this edge undirected, and it will not acquire a direction in this consensus brain graph, not even in the subsequent steps.
This way, when we reach frequency k = 1 and have processed G i1 = G i , we obtain a partially directed version of G i . Some of its edges will become directed, and the remaining will be left undirected. As we do this for all four of the groups, we obtain four directed brain graphs on the same set of nodes.

Directed consensus connectome
We unify these directed connectomes into one big directed consensus connectome as follows. If an edge is directed in the same direction in at least 2 of the 4 consensus graphs, and is undirected or directed in the other direction in at most 1 of the graphs, then the direction of this edge is defined as the most common value. For example, if an edge is directed forward in 2 graphs, backward in 1 graph, and is absent from the remaining graph, then the edge will become forward-directed in the result graph. On the other hand, if that edge is directed forward in 2 graphs, backward in 1 graph, and is undirected in the remaining graph, then its direction is considered ambiguous, and this edge will remain undirected in the result graph.
By running this algorithm, we could direct 48324 of the 71783 edges of the consensus brain graph for the whole population. This means that we could assign a direction to 67% of all the edges which were present in at least one of the braingraphs. We think that 67% as a very significant portion of the edges, since the edges include even those sporadic connections which were present in only one brain graph and thus cannot be oriented reliably. If we count only those edges which were present in all 4 consensus connectomes (there were 31873 such edges), then we see that 26305 of them were directed the same way in all 4 groups. This means that 82% of these 31873 edges acquired the same orientation in all 4 groups.
This proves the robustness of our method. Additionally, these numbers also show that the CCD phenomenon is robust, too.

How can we direct the individual braingraphs?
In the previous subsection we have presented a CCD-based method to direct most of the edges of the consensus connectome in a robust way. The edges of the individual braingraphs can be directed by applying the directions of the edges of the (partially) directed consensus braingraph as follows: Let us consider an individual braingraph G and the directed consensus braingraph � G. If an edge e of G is directed in � G, then let us direct e in the same way in G, too. If edge e of G is undirected in � G, then e will remain to be undirected in G, too.
Supporting information S1 Table. Table gives the source C# code of the program, which computes the directions of the connectome edges. (PDF)