Divergent topological architecture of the default mode network as a pretreatment predictor of early antidepressant response in major depressive disorder

Identifying a robust pretreatment neuroimaging marker would help in selecting an optimal therapy for major depressive disorder (MDD). We recruited 82 MDD patients [n = 42 treatment-responsive depression (RD) and n = 40 non-responding depression (NRD)] and 50 healthy controls (HC) for this study. Based on the thresholded partial correlation matrices of 58 specific brain regions, a graph theory approach was applied to analyse the topological properties. Compared with HC, both RD and NRD patients exhibited a lower nodal degree (Dnodal) in the left anterior cingulate gyrus; in RD patients, the Dnodal of the left superior medial orbitofrontal gyrus was significantly reduced, whereas that of the right inferior orbitofrontal gyrus was increased (all P < 0.017, FDR corrected). Moreover, the nodal degree in the right dorsolateral superior frontal cortex (SFGdor) was significantly lower in RD than in NRD. Receiver operating characteristic curve analysis demonstrated that λ and the nodal degree in the right SFGdor exhibited a good ability to distinguish nonresponding patients from responsive patients, and these measures could serve as specific markers to predict an early response to antidepressants. The disrupted topological configurations in the present study extend the understanding of pretreatment neuroimaging predictors for antidepressant medication.


Participants
The MDD participants met the following inclusion and exclusion criteria: (1) they met the DSM-IV criteria for major depressive disorder at the time of enrollment; (2) they were in their first depressive episode, with an age of onset over 18 years; (3) their score on the 24-item Hamilton Depression Rating Scale (HAMD) was greater than 20; (4) absence of another major psychiatric illness, including severe anxiety and substance abuse or dependence; (5) absence of primary neurological illness, including dementia or stroke; (6) absence of medical illness impairing cognitive function; (7) no history of electroconvulsive therapy; (8) no gross structural abnormalities on T1-weighted images and no major white matter changes, such as infarction or other vascular lesions, on T2-weighted MRI; (9) no psychotic symptoms (i.e., hallucinations, bizarre delusions or thought broadcasting). Healthy controls (HC), who were recruited from the local community, were required to meet criteria (4)-(9) above and to have no history of any affective disorder, including MDD.

Imaging Acquisition
A gradient-recalled echo-planar imaging (GRE-EPI) pulse sequence was used to acquire resting-state images. The acquisition parameters of rs-fMRI were as follows: repetition time = 2000 ms; echo time = 25 ms; flip angle = 90°; acquisition matrix = 64 × 64; field of view = 240 × 240 mm²; thickness = 3.0 mm; gap = 0 mm; 36 axial slices; and 3.75 × 3.75 mm² in-plane resolution, parallel to the anterior commissure-posterior commissure line. High-resolution T1-weighted axial images covering the whole brain were acquired using a 3-dimensional inversion recovery prepared fast spoiled gradient echo (SPGR) sequence with the following parameters: repetition time = 1900 ms; echo time = 2.48 ms; flip angle = 9°; acquisition matrix = 256 × 192; field of view = 250 mm × 250 mm; thickness = 1.0 mm; gap = 0 mm. The MRI scans were performed before the patients started antidepressant treatment. After the exclusion of scans with excessive head motion (i.e., exceeding 1.5 mm in translation or 1.5° in rotation) or poor image quality (i.e., ghost intensity), the MRI data from 82 MDD patients and 50 HC qualified for further analysis.

Functional Image Preprocessing
Functional images were preprocessed using the Data Processing Assistant for Resting-State Functional MRI (DPARSF 2.3; http://www.restfmri.net/forum/dparsf) toolkit, which combines procedures from the Resting-State Functional MR imaging toolkit (REST; http://www.restfmri.net) and the statistical parametric mapping software package (SPM8; http://www.fil.ion.ucl.ac.uk/spm). The first ten time points were discarded to ensure steady-state longitudinal magnetization and adaptation to inherent scanner noise. The remaining 230 rs-fMRI images were processed sequentially according to the following steps: (1) slice timing correction for temporal differences, with the 35th slice as the reference slice, followed by head motion correction (participants with head motion of more than 1.5 mm of maximum displacement in any direction (x, y, or z) or 1.5 degrees of angular motion were excluded from the present study); (2) coregistration of the T1 image to the functional image, followed by reorientation; (3) spatial normalization, in which the T1-weighted anatomic images were segmented into white matter, gray matter and cerebrospinal fluid and then normalized to the Montreal Neurological Institute space by using a 12-parameter nonlinear transformation; the resulting transformation parameters were applied to the functional images, which were then resampled to isotropic voxels of 3 mm; (4) spatial smoothing with a 6 mm full-width at half-maximum isotropic Gaussian kernel; (5) removal of the linear trend within each voxel's time series; (6) temporal bandpass filtering (0.01-0.08 Hz) to minimize low-frequency drift and high-frequency noise; (7) regression of the nuisance signals (global mean signal, white matter and cerebrospinal fluid signals, and head-motion parameters calculated by six-parameter rigid-body correction) and spike regressors.
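The detrending and bandpass filtering steps described above can be illustrated on a single voxel's time series. The following is a minimal standard-library sketch, not the DPARSF/REST implementation: the function names `detrend_linear` and `ideal_bandpass` are hypothetical, the filter is assumed to be an ideal (rectangular) frequency-domain filter, the detrending shown here also removes the mean, and the toy signal is invented for illustration.

```python
import cmath
import math


def detrend_linear(ts):
    """Subtract the least-squares straight line (and hence the mean) from ts."""
    n = len(ts)
    t_mean = (n - 1) / 2.0
    y_mean = sum(ts) / n
    sxx = sum((t - t_mean) ** 2 for t in range(n))
    sxy = sum((t - t_mean) * (y - y_mean) for t, y in enumerate(ts))
    slope = sxy / sxx
    return [y - (y_mean + slope * (t - t_mean)) for t, y in enumerate(ts)]


def ideal_bandpass(ts, tr, low=0.01, high=0.08):
    """Zero every Fourier component outside [low, high] Hz; tr is in seconds."""
    n = len(ts)
    # Naive O(n^2) discrete Fourier transform; adequate for a few hundred volumes.
    spec = [
        sum(ts[t] * cmath.exp(-2j * math.pi * k * t / n) for t in range(n))
        for k in range(n)
    ]
    for k in range(n):
        freq = min(k, n - k) / (n * tr)  # frequency of bin k, folding negative bins
        if not (low <= freq <= high):
            spec[k] = 0.0
    # Inverse transform; the imaginary part is only rounding error.
    return [
        (sum(spec[k] * cmath.exp(2j * math.pi * k * t / n) for k in range(n)) / n).real
        for t in range(n)
    ]


# Toy series (assumed values): TR = 2 s, a linear drift, an in-band 0.05 Hz
# wave and an out-of-band 0.2 Hz wave.
TR = 2.0
N_VOL = 200
in_band = [math.sin(2 * math.pi * 0.05 * t * TR) for t in range(N_VOL)]
raw = [
    0.01 * t + in_band[t] + math.sin(2 * math.pi * 0.2 * t * TR)
    for t in range(N_VOL)
]
filtered = ideal_bandpass(detrend_linear(raw), TR)
```

In this sketch the 0.2 Hz component is removed entirely, while the 0.05 Hz component passes through; a small residual ramp may remain because a least-squares line fitted to drift plus oscillation is not exactly the drift.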

Network Construction
A whole-brain parcellation scheme was recently created based on a large meta-analysis of fMRI studies combined with whole-brain functional connectivity mapping [1].

The small-world parameters of a network (clustering coefficient Cp and characteristic path length Lp) were originally proposed by Watts and Strogatz [4]. Briefly, the Cp of a network is the average of the clustering coefficients over all nodes, where the clustering coefficient Ci of a node is defined as the ratio of the number of existing connections among the node's neighbors to the number of all possible connections among them. Cp quantifies the local interconnectivity of a network. The Lp of a network is the shortest path length (number of edges) required to travel from one node to another, averaged over all pairs of nodes. Lp indicates the overall routing efficiency of a network. To estimate the small-world properties, we scaled the Cp and Lp derived from the brain networks by the mean Cp_rand and Lp_rand of 100 random networks (i.e., γ = Cp/Cp_rand and λ = Lp/Lp_rand) that preserved the same number of nodes, edges and degree distribution as the real networks [5]. A small-world network should fulfill the conditions γ > 1 and λ ≈ 1 [4], in which case the small-worldness scalar σ = γ/λ is higher than 1.
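The small-world quantities defined above can be sketched for a small binary, undirected graph stored as a dictionary of neighbor sets. This is an illustrative standard-library example rather than the toolbox used in the study. All function names are hypothetical. Degree-preserving random networks are assumed to be generated by repeated double-edge swaps. Lp is averaged over reachable node pairs only, and nodes with fewer than two neighbors are assigned Ci = 0.

```python
import random
from collections import deque
from itertools import combinations


def bfs_lengths(adj, source):
    """Shortest path lengths (in edges) from source to every reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def clustering_coefficient(adj):
    """Cp: mean over nodes of (links among neighbors) / (possible links)."""
    total = 0.0
    for nbrs in adj.values():
        k = len(nbrs)
        if k < 2:
            continue  # assumption: Ci = 0 for nodes with fewer than 2 neighbors
        links = sum(1 for a, b in combinations(nbrs, 2) if b in adj[a])
        total += 2.0 * links / (k * (k - 1))
    return total / len(adj)


def char_path_length(adj):
    """Lp: mean shortest path length over all reachable ordered node pairs."""
    lengths = []
    for s in adj:
        lengths.extend(d for t, d in bfs_lengths(adj, s).items() if t != s)
    return sum(lengths) / len(lengths)


def rewire(adj, n_swaps, rng):
    """Degree-preserving randomization by repeated double-edge swaps."""
    new = {u: set(vs) for u, vs in adj.items()}
    edges = [(u, v) for u in new for v in new[u] if u < v]
    for _ in range(n_swaps):
        i, j = rng.sample(range(len(edges)), 2)
        (a, b), (c, d) = edges[i], edges[j]
        # Skip swaps that would create a self-loop or a duplicate edge.
        if len({a, b, c, d}) < 4 or d in new[a] or b in new[c]:
            continue
        new[a].remove(b); new[b].remove(a)
        new[c].remove(d); new[d].remove(c)
        new[a].add(d); new[d].add(a)
        new[c].add(b); new[b].add(c)
        edges[i], edges[j] = (a, d), (c, b)
    return new


def small_world(adj, n_rand=100, seed=0):
    """Return (gamma, lambda, sigma) relative to n_rand rewired networks."""
    rng = random.Random(seed)
    n_edges = sum(len(v) for v in adj.values()) // 2
    cp, lp = clustering_coefficient(adj), char_path_length(adj)
    cps, lps = [], []
    for _ in range(n_rand):
        r = rewire(adj, 10 * n_edges, rng)
        cps.append(clustering_coefficient(r))
        lps.append(char_path_length(r))
    gamma = cp / (sum(cps) / n_rand)
    lam = lp / (sum(lps) / n_rand)
    return gamma, lam, gamma / lam


def ring_lattice(n, k):
    """Toy graph: each node linked to its k nearest neighbors on a ring (k even)."""
    adj = {i: set() for i in range(n)}
    for i in range(n):
        for step in range(1, k // 2 + 1):
            j = (i + step) % n
            adj[i].add(j); adj[j].add(i)
    return adj


lattice = ring_lattice(20, 4)
gamma, lam, sigma = small_world(lattice, n_rand=100)
```

A regular ring lattice is used only as a convenient test graph: it is strongly clustered, so γ is well above 1, whereas its long path lengths push λ above 1 as well.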

Network Efficiency
The global efficiency measures the ability of parallel information transmission over the network [6]. For a network G with N nodes and K edges, the global efficiency of G can be computed as:
$$E_{glob}(G) = \frac{1}{N(N-1)} \sum_{i \neq j \in G} \frac{1}{L_{ij}}$$
where $L_{ij}$ is the shortest path length between node i and node j in G.
The local efficiency measures the fault tolerance of the network, indicating the capability of information exchange within each subgraph when the index node is eliminated. The local efficiency of G is measured as:
$$E_{loc}(G) = \frac{1}{N} \sum_{i \in G} E_{glob}(G_i)$$
where $G_i$ denotes the subgraph composed of the nearest neighbors of node i [6].
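As an illustration of these two definitions, the sketch below computes the global and local efficiency of a binary, undirected graph stored as a dictionary of neighbor sets. It assumes the standard formulation in which unreachable node pairs contribute zero and a subgraph with fewer than two nodes has zero efficiency; the function names are hypothetical.

```python
from collections import deque


def shortest_paths(adj, source):
    """Breadth-first search: path length in edges to each reachable node."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        u = queue.popleft()
        for v in adj[u]:
            if v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    return dist


def global_efficiency(adj):
    """Mean of 1 / L_ij over all ordered pairs i != j (unreachable pairs add 0)."""
    n = len(adj)
    if n < 2:
        return 0.0
    total = 0.0
    for s in adj:
        dist = shortest_paths(adj, s)
        total += sum(1.0 / d for t, d in dist.items() if t != s)
    return total / (n * (n - 1))


def local_efficiency(adj):
    """Mean over nodes i of the global efficiency of the subgraph G_i."""
    total = 0.0
    for nbrs in adj.values():
        # G_i: the neighbors of node i and the edges among them (node i removed).
        sub = {u: adj[u] & nbrs for u in nbrs}
        total += global_efficiency(sub)
    return total / len(adj)


# Toy graphs: a fully connected 4-node graph and a 4-node star.
complete4 = {i: {j for j in range(4) if j != i} for i in range(4)}
star4 = {0: {1, 2, 3}, 1: {0}, 2: {0}, 3: {0}}
```

For the star graph, the leaves communicate only through the hub, so removing the hub leaves its neighbors disconnected and the local efficiency is zero, which is the sense in which this measure reflects fault tolerance.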

Regional Nodal Characteristics
To evaluate the roles of brain regions (or nodes) in brain networks, we computed the regional efficiency $E_{nodal}(i)$ [2]. Nodal efficiency measures the information propagation ability of a node with the rest of the nodes in the network. The nodal efficiency of node i is computed as:
$$E_{nodal}(i) = \frac{1}{N-1} \sum_{j \neq i \in G} \frac{1}{L_{ij}}$$
where $L_{ij}$ is the shortest path length between node i and node j in G.
Finally, the nodal degree of a node i is defined as:
$$D_{nodal}(i) = \sum_{j \neq i \in G} e_{ij}$$
where $e_{ij}$ is the (i,j) element in the formerly generated binary, undirected network.
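The two nodal measures can be sketched directly on a binary adjacency matrix, where entry (i, j) is 1 if nodes i and j are connected and 0 otherwise. This is a minimal example with hypothetical function names; unreachable nodes are assumed to contribute zero to the nodal efficiency.

```python
from collections import deque


def nodal_degree(A, i):
    """Number of edges attached to node i in the binary matrix A."""
    return sum(A[i][j] for j in range(len(A)) if j != i)


def nodal_efficiency(A, i):
    """Mean inverse shortest path length from node i to every other node."""
    n = len(A)
    dist = {i: 0}
    queue = deque([i])
    while queue:
        u = queue.popleft()
        for v in range(n):
            if A[u][v] and v not in dist:
                dist[v] = dist[u] + 1
                queue.append(v)
    # Nodes never reached are absent from dist and therefore add nothing.
    return sum(1.0 / d for j, d in dist.items() if j != i) / (n - 1)


# Toy network: a three-node chain 0 - 1 - 2.
chain = [
    [0, 1, 0],
    [1, 0, 1],
    [0, 1, 0],
]
```

In the chain, the middle node has degree 2 and nodal efficiency 1, whereas each end node has degree 1 and nodal efficiency (1 + 1/2) / 2 = 0.75.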

Network metrics
To test the null hypothesis that the observed group differences could occur by chance, we randomly reallocated each subject to one of the two groups and recomputed the mean differences between the two randomized groups. The randomization procedure was repeated 10,000 times, yielding a randomized null distribution of between-group differences for each metric. The 95th percentile of this distribution was then used as the critical value for a two-tailed test of the null hypothesis. This permutation test procedure was repeated at each sparsity in the range 6% ≤ S ≤ 34%.
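The permutation procedure can be sketched as follows for a single metric at a single sparsity. This is an illustrative interpretation, not the authors' code: the function name and the toy values are hypothetical, and the critical value is taken here as the 95th percentile of the absolute permuted differences, which is one way to obtain a two-tailed test at the 0.05 level.

```python
import random


def mean(values):
    return sum(values) / len(values)


def permutation_test(group_a, group_b, n_perm=10000, alpha=0.05, seed=0):
    """Return (observed difference, critical value, significant?)."""
    rng = random.Random(seed)
    observed = mean(group_a) - mean(group_b)
    pooled = list(group_a) + list(group_b)
    n_a = len(group_a)
    null = []
    for _ in range(n_perm):
        # Randomly reallocate subjects to two groups of the original sizes.
        rng.shuffle(pooled)
        null.append(abs(mean(pooled[:n_a]) - mean(pooled[n_a:])))
    null.sort()
    # Assumption: percentile of |difference| serves as the two-tailed cutoff.
    index = min(n_perm - 1, int(round((1 - alpha) * n_perm)))
    critical = null[index]
    return observed, critical, abs(observed) > critical


# Toy metric values for two clearly separated groups (invented numbers).
group_1 = [10, 11, 12, 13, 14, 15]
group_2 = [0, 1, 2, 3, 4, 5]
observed, critical, significant = permutation_test(group_1, group_2, n_perm=2000)
```

In practice the same function would be applied to each network metric separately, after any covariate adjustment, and repeated at every sparsity threshold.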
Additionally, the same permutation procedure was used to analyse the AUC of network measures between groups. Furthermore, before the permutation tests, multiple linear regression analyses were applied to regress out the confounding effects of age, gender and years of education for each network metric.

Supplementary eFigure 1. A flowchart for the construction of functional brain networks.
Note: For each subject, a correlation matrix was obtained by calculating the interregional Pearson's correlation coefficients of the mean time series among the 58 DMN regions, yielding a non-binarized matrix; these correlation matrices were then converted into binarized matrices by applying a thresholding procedure; finally, the resulting binary matrices were represented as networks or graphs composed of brain nodes and edges.
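The thresholding step, and the summary of a metric across the sparsity range, can be sketched as below. Several details are assumptions rather than statements from the text: sparsity is taken to be the fraction of possible edges retained, edges are ranked by signed correlation (strongest positive first), the AUC is taken to be the area under the metric-versus-sparsity curve computed with the trapezoidal rule, and all names and toy values are hypothetical.

```python
def binarize_at_sparsity(corr, sparsity):
    """Keep the strongest fraction `sparsity` of possible edges as 1, others 0."""
    n = len(corr)
    pairs = sorted(
        ((corr[i][j], i, j) for i in range(n) for j in range(i + 1, n)),
        reverse=True,
    )
    n_keep = int(round(sparsity * len(pairs)))
    A = [[0] * n for _ in range(n)]
    for _, i, j in pairs[:n_keep]:
        A[i][j] = A[j][i] = 1  # undirected: fill both triangles
    return A


def mean_degree(A):
    """Example metric: average number of edges per node."""
    return sum(sum(row) for row in A) / len(A)


def area_under_curve(xs, ys):
    """Trapezoidal area under the points (xs[k], ys[k])."""
    return sum(
        (xs[k + 1] - xs[k]) * (ys[k] + ys[k + 1]) / 2.0
        for k in range(len(xs) - 1)
    )


# Toy 4-region correlation matrix (invented values).
toy_corr = [
    [1.0, 0.9, 0.1, 0.2],
    [0.9, 1.0, 0.8, 0.3],
    [0.1, 0.8, 1.0, 0.7],
    [0.2, 0.3, 0.7, 1.0],
]
sparsities = [0.2, 0.5, 0.8]
metric_values = [mean_degree(binarize_at_sparsity(toy_corr, s)) for s in sparsities]
metric_auc = area_under_curve(sparsities, metric_values)
```

Summarizing a metric by its area across thresholds avoids tying the group comparison to any single, arbitrarily chosen sparsity.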