A software tool for the analysis of neuronal morphology data.

Anatomy plays a fundamental role in supporting and shaping nervous system activity. The remarkable progress of computer processing power within the last two decades has enabled the generation of electronic databases of complete three-dimensional (3D) dendritic and axonal morphology for neuroanatomical studies. Several laboratories are freely posting their reconstructions online after result publication v.gr. NeuroMorpho.Org (Nat Rev Neurosci7:318-324, 2006). These neuroanatomical archives represent a crucial resource to explore the relationship between structure and function in the brain (Front Neurosci6:49, 2012). However, such 'Cartesian' descriptions bear little intuitive information for neuroscientists. Here, we developed a simple prototype of a MATLAB-based software tool to quantitatively describe the 3D neuronal structures from public repositories. The program imports neuronal reconstructions and quantifies statistical distributions of basic morphological parameters such as branch length, tortuosity, branch's genealogy and bifurcation angles. Using these morphological distributions, our algorithm can generate a set of virtual neurons readily usable for network simulations.

Electronic databases of complete 3D dendrites constitute a valuable tool to explore the morphological structure of single neurons [1]. The data acquisition of those structures comprises a multi-step process from tissue collection and staining to the extraction of neuronal structural information via a variety of imaging techniques. To date, the majority of dendritic and axonal morphology reconstructions are based on bright-field microscopy mainly because of its broad compatibility with histological staining methods [2]. Digital tracing of neuronal morphology converts large amounts of imaging information into a simple and compact representation which can be easily visualized, quantified, archived, and shared [3], thus maximizing the opportunity to exploit the full potential of unrestricted morphometric analyses [1,4].
There are multiple ways to digitize neuronal morphology once it has been visualized by optical microscopy. One effective way to describe the treelike branching of axons and dendrites can be achieved by using a sequence of interconnected cylinders. In this 'vector' representation, each uniform segment in the arbor can be characterized by five values, consisting of the three Euclidean 'x', 'y' and 'z' coordinates, a diameter of its ending location, and the identity of the 'parent' segment from which each new segment originates. Thus, by definition, compartments are segments represented as cylinders with a given diameter and the coordinates of the extreme points. Branches are formed with one or more compartments between the soma, the bifurcations, and the tips. Bifurcations are defined as the points where a branch splits into 'daughter' branches.
This 'Cartesian' description of neuronal structures constitutes a complete mapping of dendritic morphology but bears little intuitive information. To extract quantitative measures of neuronal morphology, we developed a software tool written in MATLAB (MATLAB R2012a, MathWorks, Inc.) that reads these 3D dendritic reconstructions and computes morphological parameters from a large and representative set of neurons. This tool is freely available upon request.
Our software imports the cells, which are stored in a non-proprietary *.swc format including a header of comments where each line is preceded by a '#' sign. These lines describe the program used for neuron tracing (usually Neurolucida, MicroBrightField, Inc.), localization of the neuron (region, Field/Layer, etc.), type of cell, contributor, reference, soma area, shrinkage correction, number and date versions. Followed by these remarks, the *.swc file consists of a [n x 7] matrix which contains the following fields: Index (Column 1), a user defined flag denoting the specific part of the structure (cell body, apical dendrites, basilar dendrites and axon; Column 2), 3D coordinates (x, y, z, in μm; Columns 3-5), radius (r, in μm; Column 6), and parent index (Column 7). As two points connected by a straight line constitute a segment, then each neuronal reconstruction with n points has n-1 total segments, where n is the maximum row size of the matrix. The import file function deletes the header and stores the coordinates into a [n x 7] matrix in MATLAB's workspace for further analysis. In all cases the 'x', 'y' and 'z' values were corrected for shrinkage and lens medium refraction.
The identification of the branches was performed following a series of rules that determine whether two segments belong to the same branch. A segment is specified as a pair of points joined by a straight line and defined in the matrix as a pair of rows where the parent index of the first row (column 7) equals the index of the second row (column 1). Graphically this is the line that connects (x i ,y i ,z i ) with (x j ,y j ,z j ) where 'i' is the parent index and 'j' is the index. A branch is therefore conformed by one or more segments. Two connected segments belong to the same branch only if: i) the parent index of the second segment equals the index of the first segment, ii) both segments share the same structure flag stored in column 2 and iii) there is no other segment in the whole matrix that has the same parent index. If these restrictions applied, a sequential numerical label was assigned to the two segments in order to specify that all segments with the same name belonged to the same branch. At the end of this procedure, the set of rows that share the same firstname conform a branch whose first-name is a unique numerical label ( Figure 1A-C). Once each branch has its own first-name it is possible to track the path of its ancestors back to the soma ( Figure 1D). Consequently, the 'full-name' of a branch is computed as follows: the parent index in each row points to another row, which belongs to a branch with an already assigned first-name. Unless both rows belong to the same branch (i.e. having the same first-name), this name corresponds to the row's parent name. The row's parent-name (i.e. a full name) is stored and this process is repeated until the whole branching genealogy is generated. Thus, at the end of this process, each row has an associated full-name which can be traced backwards to its ancestors ( Figure 1D). Here again, the set of rows that share the same full-name conform a branch. Full-name length (number of ancestors+1) equals the path-length and this measure was further used to compute topographic measurements (see below). Our algorithm successfully identified branches and assigned a full-name to each of them. Plots assigning a pseudorandom color to each branch confirmed that all branches were correctly identified, including those that generated dichotomous bifurcations ( Figure 1E). Additionally, the plotter was able to represent vector's width equal to segment's diameter and/or assign four different colors depending on the specific cell structure (cell body, apical dendrites, basal dendrites or axon) according to the userdefined flag stored in the second column of the matrix (not illustrated).
Before being subjected to quantitative analysis, we first plotted the experimentally reconstructed neuronal structures by computer-assisted graphical rendering, zooming, panning, and rotating ( Figure 2). This allowed us to visualize the neurons as a series of black connected vectors (following index/parent index rules). Dendrites of hippocampal principal neurons displayed a polarized shape, as if they were enclosed in cones [5][6][7]. Furthermore, in hippocampal pyramidal cells ( Figure 2B-H), basal and apical trees invaded opposite hemi-spaces, whereas the hippocampal granule cell tree invaded the same hemispace ( Figure 2A). The dendritic arbors were clearly fanned out, each seemingly maintaining a preferred orientation throughout their length [6,7].
The analyzed cells from the Duke-Southampton archive exhibited a tendency to deviate repeatedly from the straight direction and return to the initial orientation after considerable meandering ('zigzags' towards the z-axis, Figure 2A, B). It is known that the acquisition and assembly procedures introduce morphological noise in any representation of digitized neurons [9][10][11], which makes it difficult to carry out a meaningful statistical analysis. We implemented a feature to perform z-coordinate smoothing aimed to diminish morphological parameter miscalculations due to such extensive amount of noise in the 'z' axis ( Figure 2B, C). Specifically, this function smoothes the 'z' data of each branch using a moving average filter. As the spatial distribution of points defining segments is not uniform, their distance projected in the 'xy' plane is used as predictor data for the z-smoothing. However, applying this procedure to each independent branch would result in local 'z-jumps' at bifurcations and spatial continuity between related branches is required. To solve this problem, the smoothing was performed concatenating two additional points located at the branch's endings: the first point given by an average of the final z-coordinate of the parent plus the initial z-coordinate of the sister. The latter represents the average of the initial z-coordinates of all daughters (if present). Aside from these issues, it is important to mention that the criteria for sampling data points for a morphological structure are subjective and to some extent arbitrary. Due to the complexity of dendritic morphology, the very same neurons mounted on microscope slides, and traced by different researchers or on different reconstruction Figure 1 Branch identification and genealogy. (A) According to a putative digital neuronal reconstruction, it is possible to isolate all segments that belong to a specific dendritic tree. (B) The branch identification procedure allows detecting all branches (represented with different colors), each one containing a series of connected segments (numbers of segments beside each branch). (C) First-name assignment provides an arbitrary but unique numerical label to each branch. (D) For a specific branch, full-name assignment 'routes' the names of the preceding branches backwards to the soma. A branch is identified by its full-name, which also provides information about its ancestors (i.e. how many ancestors it has and which are their first-names). (E) An isolated basal dendritic tree from CA1 cell pc4c_b (same as in Figure 2 , panel E). Different branches are rendered using different random-assigned colors. Calibration bars 100 μm.
systems, can result in considerably different digital files [10]. In this context, the issue of quality control for morphological data is extremely important and should be taken carefully into consideration in any morphological study before interpreting the results. In other words, digital files of dendritic morphology are rarely accurate representations of biological structures; they constitute only an approximation of the neuron. Nevertheless, if a digital data set is internally consistent (v.gr. correct indexing, no '0-length segments', etc.), then the mathematical problem of its quantitative representation is independent on the data quality.
Geographical studies have proposed various ways of ordering branches in a stream network. Contemporary interest in stream ordering derives largely from the work of Horton [12], who drew attention to a number of empirical regularities, usually now known as Horton's laws. His ordering scheme, however, is described as a variant of a method proposed later by Strahler [13]. 'Strahler ordering' assigns a number to each segment of the tree, which we refer to as the Strahler order number. Strahler ordering consists of the following steps: it starts with the ending branches of the rooted tree, that is, with those nodes, excluding the root, which are contiguous to only one edge. All edges that are contiguous to a branch are branches of order 1 ( Figure 3A). These are streams with no tributary. When two or more branches of order m come together, the third edge contiguous to that branch belongs to a branch of order [m+1]. When a branch of order m meets a branch of order n, where n exceeds m, the third edge at the branch is a continuation of the branch of order n ( Figure 3A). There are no arbitrary decisions in Strahler ordering, and it is purely topological. We tested the Strahler ordering function for correct indexing with a reconstructed neuron. Interestingly, all the pyramidal CA1 cells from the Gulyás repository had very low maximum Strahler indexes. This can be explained by the high number of terminal branches along their paths ( Figure 3B).
Because cells belonging to the Gulyás repository (Group 4; Figure 2D-G) showed no z-noise, we used them for further testing of morphological analyses. Accumulative morphological measurement consists of those parameters that require calculations made by using information from multiple segments. For example, the length of a branch is computed as the sum of the Euclidean lengths (in 3D) from all segments belonging to this branch. Tapering, measures changes in diameter along all the segments within a branch. This is reported as the slope of a linear regression model adjusted to all diameter values as a function of the accumulative branch's length. Additionally, a mean diameter/branch ± S.E.M is reported. Also, the distance metric (DM) tortuosity (a dimensionless number) provides a ratio between the actual path length of a meandering curve (i.e. along segments) and the linear distance between endpoints, whereas the sum of angles metric (SOAM) tortuosity integrates total curvature along a curve and normalizes it by path length [14,15], thereby handling tight coils better. Thus, for any point P k we defined the vectors T 1 =P k -P k-1 , T 2 =P k+1 -P k , and T 3 =P k+2 -P k+1 . The inplane angle at point P k (IP k ) and the torsional angle (TP k ) were given by the following equations, where TP k , IP k ∈ [0,π]: The total angle CP k at point P k is then The SOAM calculates the total tortuosity of the curve as Note that this expression normalizes the total curvature with respect to total curve length. This means that SOAM values can be compared between two branches of different length.
Bifurcation angles are taken as the angle between a daughter and its parent branch. To compute bifurcation angles between a daughter and its parent branch it is first necessary to calculate directional vectors at the beginning and at the end of each branch. The bifurcation angle between two branches does not only depend on its terminal segments. For this reason, when computing directional vectors it is necessary to consider several segments and then proceed with a 3D linear regression from where a normalized directional vector will be extracted. We used five segments per branch for directional vector computation (although we believe that the number of segments used should be justified in terms of their tortuosity).
Once accumulative morphological parameters are stored in the last row of each branch, it is possible to make a 'vertical compression' to produce a new matrix with each row corresponding to information from a branch with its respective accumulative morphological parameters.
Rall's pioneering development of the idealized equivalent cylinder model for passive dendrites [16] opened up the field of neuronal modeling. One prediction of his model is that the sum of the diameters of the daughter branches at each bifurcation, raised to the 3/2 power, must equal the parent branch diameter also raised to 3/2 power (the '3/2 power rule'). Therefore, our software  computes 'Rall's exponent' for each particular branch point by minimizing the following difference: All basic parameters were measured from digital files of traced neurons. Raw data for each parameter were extracted in the form of simple arrays, grouped for each cell class and characterized with histograms representing frequency distributions. Group results for three groups of granule cells (4A), apical (4B), and basilar (4C) dendritic trees of CA1 pyramidal cells are shown in Figure 4. Bin selection was made independently for each group according to Sturge's rule. However, selecting the same bin width for each histogram distribution allowed us to detect correlations between measurements for the same trees (such as depth×length and Euclidean distance×path distance; data not shown) as well as differences between morphological basic parameters of different dendritic trees ( Figure 5). For example, apical CA1 dendritic trees project at longer distances than basilar dendritic trees. This feature is reflected in the frequency distribution of the number of branches located at different 'Euclidean' path lengths (with the Figure 5 Formal Comparison between Apical and Basilar Dendritic Trees. Data from basal (A) and apical (B) dendrites from CA1 pyramidal cells were grouped and frequency distributions were computed for different morphological parameters using the same bin width. Apical CA1 dendritic trees project at longer distances than basilar dendritic trees. This is also evident when comparing the number of branches and bifurcations at different paths from the soma (two upper panels). Interestingly, branches from both, apical and basal dendritic trees show a similar shape for the branch length distributions. This suggests that both dendritic trees are constructed upon equally long branches (panels in the fourth row). Thus, a longer tree is a consequence of an increased number of total branches. starting point at the soma), and also when comparing the number of branches at different paths from the soma ( Figure 5). Notably, branches from both dendritic trees show similar lengths, which indicates that both dendritic trees are constructed upon equally long branches ( Figure 5). In other words, a longer dendritic tree (in spatial terms) is a consequence of the number of total branches employed to build the tree.
Elucidating the complex organization of the brain will require synthesis of information about neuron types, the spatial patterns of their dendritic and axonal arborizations, cell numbers and densities, as well as synapse number and location [17,18]. In the central nervous system, the shape of the dendritic arbor is related to the cell-type specificity and to the large number of synaptic inputs. The extent of dendritic arbors, at least in sensory neurons of the peripheral nervous system, physically defines their receptive fields [19], and axonal topology is known to affect synaptic output [20]. Discoveries that many dendrites conduct input signals actively, back-propagate action potentials, and integrate synaptic inputs by means of time-dependent nonlinear summation provide indisputable evidence that dendritic morphology is a key aspect of the neuronal machinery underlying signal processing and integration [21]. Dendritic structure contributes significantly to neuronal information processing [22,23] and computational models have shown that dendritic geometry can be responsible for producing an entire spectrum of firing patterns displayed across different cortical neuronal types [24], and also within a single class of hippocampal neurons [25]. The importance of dendrites for neuronal activity is evidenced by the influence of dendritic morphology on network connectivity [26] as it is constantly reshaped by the dynamic remodeling of both dendrites and axons, which is crucial in determining the pattern of synaptic formation among neurons [27].
Here, we developed a prototype of a MATLAB based software package to characterize neuronal dendrites on the basis of the statistical distributions of morphological parameters. From a merely morphological point of view and assuming that cells located on a specific site and under strict experimental conditions share similar morphological properties, the neuroanatomy of a cell class can be measured and compressed by quantifying statistical distributions of relevant morphological parameters. Such an approach is important for understanding the heterogeneity of the different neuronal groups, as well as for unveiling the relationship between neuronal structure and function. Hence, this tool can be applied for comparative anatomy, developmental neurobiology and medical diagnosis [4]. The resulting statistical descriptions of neuronal morphology can be further used to create an unlimited number of non-identical virtual neurons (data not shown). Virtual generation of axonal and dendritic arbors is useful to explore mechanisms of growth [28,29] and to construct biologically realistic neural networks [28,30].