Nonlinear functional network connectivity in resting functional magnetic resonance imaging data

Abstract In this work, we focus on explicitly nonlinear relationships in functional networks. We introduce a technique using normalized mutual information (NMI) that calculates the nonlinear relationship between different brain regions. We demonstrate our proposed approach using simulated data and then apply it to a dataset previously studied by Damaraju et al. This resting‐state fMRI data included 151 schizophrenia patients and 163 age‐ and gender‐matched healthy controls. We first decomposed these data using group independent component analysis (ICA) and yielded 47 functionally relevant intrinsic connectivity networks. Our analysis showed a modularized nonlinear relationship among brain functional networks that was particularly noticeable in the sensory and visual cortex. Interestingly, the modularity appears both meaningful and distinct from that revealed by the linear approach. Group analysis identified significant differences in explicitly nonlinear functional network connectivity (FNC) between schizophrenia patients and healthy controls, particularly in the visual cortex, with controls showing more nonlinearity (i.e., higher normalized mutual information between time courses with linear relationships removed) in most cases. Certain domains, including subcortical and auditory, showed relatively less nonlinear FNC (i.e., lower normalized mutual information), whereas links between the visual and other domains showed evidence of substantial nonlinear and modular properties. Overall, these results suggest that quantifying nonlinear dependencies of functional connectivity may provide a complementary and potentially important tool for studying brain function by exposing relevant variation that is typically ignored. Beyond this, we propose a method that captures both linear and nonlinear effects in a “boosted” approach. This method increases the sensitivity to group differences compared to the standard linear approach, at the cost of being unable to separate linear and nonlinear effects.


| INTRODUCTION
During the past few decades, functional magnetic resonance imaging (fMRI) has become one of the most widely approaches for understanding brain function. In this area, functional network connectivity (FNC) has been widely used to analyze the relationship among distinct brain's regions Bastos & Schoffelen, 2016;Friston, 2011;Sala-Llonch et al., 2015;van den Heuvel & Hulshoff Pol, 2010). Most functional connectivity studies, concentrate on linear relationships between time courses. Studies on network models for fMRI data suggest linear correlation as a successful statistical tool to identify the relation between fMRI time courses (Smith et al., 2011), which is also easy to calculate and interpret for positive and negative correlations in the field (e.g., the default mode network tends to be anticorrelated to other networks).
However, some research has suggested that some of the brain activity exhibits nonlinear dynamic behavior (Lahaye et al., 2003;Stam, 2005;Su et al., 2013;Wismüller et al., 2014). Other studies discuss the nonlinear effects of hemodynamic responses in fMRI data (Deneux & Faugeras, 2006;Miller et al., 2001;Obata et al., 2004), which, importantly, can also vary with time (and location) and changes from subject to subject (de Zwart et al., 2009). Considering even just these few examples of nonlinear effects, it is likely, even expected, that distinct brain areas might be nonlinearly related in a way that would be missed by conventional linear analysis.
In the current study, we were interested in evaluating the degree to which explicitly nonlinear relationships (i.e., after removing the linear relationships) exist among brain regions in a functional connectivity context and identifying significant dependencies. To our knowledge, there has been little work studying explicitly nonlinear relationships in functional connectivity.
Despite being widely used in the field, the linear correlation coefficient measures only linear relationships and ignores nonlinear contributions. Other higher-order statistical tools that can assess nonlinearity are not quite sensitive to the fMRI time course relationship (Smith et al., 2011). As a result, we proposed a new statistical tool to measure explicitly nonlinear dependencies. We focused on a normalized version of mutual information (MI), which is an information theoretic approach that has the advantage of being capable of measuring both linear and nonlinear dependencies. Early work evaluated MI to capture more general relationships (V. Calhoun et al., 2003).
More recently, alternative metrics for functional connectivity, including MI, have been explored (Mohanty et al., 2020;Sundaram et al., 2020;Tedeschi et al., 2005;Tsai et al., 1999;Wang et al., 2015;Zhang et al., 2018). However, to our knowledge, we are the first group to assess the explicitly nonlinear relationships among brain networks to evaluate their unique aspects relative to the linear relationships.
In summary, the contributions of this article are as follows. We developed an approach that explores the nonlinear dependencies among functional brain networks. Our method calculates the nonlinear dependency by employing the mutual information among the residual dependence after removing the linear relationship using a regression scheme. To assess whether the nonlinear relationships were potentially meaningful, we first focus on whether the resulting FNC matrices exhibit modular relationships consistent with functional integration. Second, we evaluated whether the nonlinear FNC shows meaningful group differences in a dataset consisting of resting fMRI data collected from schizophrenia patients and healthy controls.
Finally, we proposed a statistical method that provides an option to preserve the linear interpretation while also accounting for additional nonlinear dependency.

| Participants and preprocessing
In this work, we use the fBIRN dataset, which has been analyzed previously (Damaraju et al., 2014). The final curated dataset consisted of 163 healthy participants (mean age 36.9, 117 males; 46 females) and 151 age-and gender-matched patients with schizophrenia (mean age 37.8; 114 males, 37 females). Eyes-closed resting-state fMRI data were collected at seven sites across the United States (Keator et al., 2016). Informed consent was obtained from all subjects before scanning in accordance with the internal review boards of corresponding institutions. Imaging data of one site were captured on the General Electric Discovery MR750 scanner, and the rest of the six sites were collected on Siemens Tim Trio System. Resting-state fMRI scans were acquired using a standard gradient-echo echo-planar imaging paradigm: FOV of 220 Â 220 mm (64 Â 64 matrices), TR = 2 s, TE = 30 ms, FA = 770, 162 volumes, 32 sequential ascending axial slices of 4 mm thickness and 1 mm skip. Data were preprocessed by using several toolboxes such as AFNI, SPM, and GIFT. Rigid body motion correction using the INRIAlign (Friston, 2011) toolbox in SPM was applied to correct for head motion. To remove the outliers, the AFNI3s 3dDespike algorithm was performed. Then fMRI data were resampled to 3 mm 3 isotropic voxels. Then data were smoothed to 6 mm full width at half maximum (FWHM) using AFNI3s BlurToFWHM algorithm and each voxel time course was variance normalized. Subjects with larger movement were excluded from the analysis to mitigate motion effects during the curation process. For more details, refer to the study by Damaraju et al. (2014)).

| Postprocessing
The GIFT (http://trendscenter.org/software/gift) implementation of group-level spatial ICA was used to estimate 100 functional networks as ICA components. A subject-specific data reduction step was first used to reduce 162 time point data into 100 directions of maximal variability using principal component analysis. Next, the infomax approach (Bell & Sejnowski, 1995) was used to estimate 100 maximally independent components from the group PCA reduced matrix. The ICA algorithm was repeated multiple times for stability of estimation, and the most central run was selected as representative (Du et al., 2014). Finally, aggregated spatial maps were estimated as the modes of component clusters. Subject specific spatial maps (SMs) and time courses (TCs) were obtained using the spatiotemporal regression back reconstruction approach (Calhoun et al., 2001;Erhardt et al., 2011) implemented in the GIFT software.
To label the components, regions of peak activation clusters for each specific spatial map were obtained. After ICA processing, to acquire areas of peak activation clusters, one sample t-test maps are taken for each SM across all subjects and then thresholded; also, mean power spectra of the corresponding TCs were computed. The set of components as intrinsic connectivity networks (ICNs) was identified if their peak activation clusters fell within gray matter and showed less overlap with known vascular, susceptibility, ventricular, and edge regions corresponding to head motion. This resulted in 47 ICNs out of the 100 independent components. Running over 20 times ICASSO, the cluster stability/quality index for all except one ICNs was very high. After TCs were detrended and orthogonalized by considering estimated subject motion parameters, spikes were detected by AFNI3s 3dDespike algorithm and replaced by third-order spline fit values. For more detail see Allen et al. (2012) and Damaraju et al. (2014). After processing, the fBIRN dataset resulted in a matrix of 159 time points Â 47 components Â 314 subjects, including 163 Control and 151 SZ subjects.

| Mutual information approach
While linear correlation is the most widely used measure to describe dependence, it can completely miss nonlinear dependencies. An example to illustrate this shortfall is Anscombe's Quartet (Anscombe, 1973), which shows that four plots of various nonrandom data points have the same correlation coefficient despite their wildly different dependence structure. To measure the explicitly nonlinear relation between a pair of TCs, the approach applied in this research was to remove the linear correlation and calculate the residual dependence.
The Pearson product moment correlation coefficient, ρ, of time courses x and y is where S x and S y are, respectively, the sample standard deviations, and Cov x,y ð Þis the sample covariance between x and y. H y ð Þ). In this work, we used the latter as (Horibe, 1985) proved that it is a (normalized) similarity metric. The normalized MI (NMI) formula is In this work, our goal was to calculate only the nonlinear component of dependence. To do so, we measure the data's mutual information dependencies after removing the linear dependency. For a given time courses x and y, fitting a linear model y ¼ αx þ β gives the linear correlation between x and y. Here, y is the best linear estimation of y when x is given, the slope is denoted by α, and β is the y-intercept.
Next, we cancel the linear effect by calculating z ¼ y À y. The nonlinear dependency of x and z is the same as x and y. Next, we can use NMI x, z ð Þ to evaluate the nonlinear dependency of x and y. To assure symmetricity, that is, NMI x, y ð Þ¼NMI y, x ð Þ, we took the average of the results when switching x and y.

| Simulated experiment
We applied the proposed method to simulated data to illustrate their use. In this experiment, we started with a vector say x of size 1000 Â 1 where its components are from a random uniform distribution on [0 1]. Next, we formed three vectors y 1 , y 2 , and y 3 , such that each one has a particular relationship with x. Three different types of relationships are as follows: Case I, vector y 1 has a purely linear relationship with x. Case II, we defined y 2 to have a quadratic relationship and no linear correlation with x. That is x and y 2 are only nonlinearly related. Case III, vector y 3 has a combination of linear and nonlinear dependencies with x. We also added zero-mean Gaussian noise to y 1 , y 2 , and y 3 ( Figure 1).
We measured the relationship of x, y 1 ð Þ, x, y 2 ð Þ, and x, y 3 ð Þ using both Pearson correlation and normalized mutual information approaches. Pearson correlation takes value from À1 to 1. Briefly, À1 refers to a perfectly linear negative correlation, and 1 shows a perfectly linear positive correlation. The normalized mutual information we use in this work is in the range of [0,1]. The NMI = 0 indicates no dependency, and as two distributions increase their dependence, the NMI value rises to a maximum of 1. Before computing correlation and NMI, we implemented the procedure explained earlier to remove the linear correlation from y 1 ,y 2 ,and y 3 . Next, we calculated the Pearson correlation and normalized mutual information for each pair, as shown in Table 1.

| Quantifying nonlinear connectivity in fMRI data
For each subject, there are 47 ICA time courses of length 159. For each pair of time courses x and y, we compute the traditional FNC (i.e., the linear correlation between all pairs x and y). Next, the mean FNC matrix is calculated overall 314 subjects ( Figure 2a).
We then fit a linear model to estimate the linear correlation between x and y. After this, we remove the linear effect to study the remaining dependencies by updating y as y ¼ y À y. Next, we calculate the residual dependencies among functional network components in fMRI data via NMI. This produces a matrix of 47 by 47 for each subject in which the value in x, y ð Þ entry shows the nonlinear dependencies calculated by the NMI method for x and y. After that, we computed the average overall subjects (Figure 2b). Then, to evaluate whether the NMI showed significant variation across the brain, we performed a t-test comparing the mean of each cell to that of the minimum to identify cells where the average in given cell is significantly greater than the minimum average cell. In addition, we used the random matrix analysis method (Vergara et al., 2018) to examine the modularity of the resulting explicitly nonlinear dependency matrix.
Based on the result of linear FNC and explicitly nonlinear FNC in We also compare the nonlinear dependencies between schizophrenia patients and controls. The linear effect is canceled within each group, and the average NMI is calculated over all subjects. Then, we implemented a t-test to identify significant group differences. For false discovery rate (FDR) correction, all p values were adjusted by the F I G U R E 1 Three simulation cases for linear and nonlinear relationships between two vectors. Vector x has its components randomly derived from a uniform distribution on [0 1]. From left to right, we have Case I, Case II, and Case III such that in Case I, y 1 ¼ 2x þ ε (linear relationship between x and y 1 ). In Case II, we have y 2 ¼ 5 x À 0:5 ð Þ 2 þ ε (nonlinear relationship between x and y 2 ) and for Case III, y 3 ¼ 5 x À 0:5 ð Þ 2 þ 2x þ ε (combination of linear and nonlinear relationships between x and y 3 ). Noise ε is a Gaussian distribution with a mean of zero T A B L E 1 Simulation of three cases, including Case I: Linear correlation, Case II: Nonlinear relation and, Case III: Linear and nonlinear relationship between two vectors Note: The contribution of two vectors in each case was measured by Pearson correlation (Corr) and normalized mutual information (NMI). In this table, Corr 1 and NMI 1 show the correlation between the original data, and Corr 2 and NMI 2 show the correlation after removing the linear relationship. As expected, the correlation is effectively zero after the removal of linear effects. Results show that correlation completely misses the residual nonlinear dependencies, and that the NMI approach is able to effectively capture the nonlinear relationships when they exist.
Benjamini-Hochberg correction method and thresholded at a corrected p < .05.

| Boosted approach
While we emphasize the unique information contained in the nonlinearities, future studies may wish to leverage both linear and nonlinear information. Pearson correlation is widely used in functional connectivity studies because of the simple calculation and being more sensitive to capture the dominant linear dependencies among the fMRI time courses. In addition, the sign of linear correlation reveals meaningful information about brain function. We propose a method that provides an option to preserve the linear interpretation while also accounting for additional nonlinear dependency. This boosted approach is a combination of Pearson correlation and modified mutual information for quantifying nonlinear dependencies as described in Section 2.3. We define the boosted method as F I G U R E 3 Linear and explicitly nonlinear relationship of two pairs of ICN components. The plots on the left are related to pair (23, 38) for one subject, and the plots on the right belong to the same subject for pair (2, 3). In panel (a), the nonlinear pair moves from a low to high coupling, where the high coupling links a higher frequency to a lower frequency pattern (right side of the figure on the left). Panel (b) displays the frequency spectrum. It shows when the two-time courses exhibit dependencies but have different frequency profiles; linear correlations are not well capturing the relationships. This highlights one interesting aspect but note it may not be generally the case as these patterns vary across pairs and subjects.

| Joint distributions
To visualize the identified nonlinear relationships further, we selected the five component pairs with the most significant p values in the ttest for group differences in the nonlinear dependence for HC-SZ.
Then we constructed the difference in the joint distributions for each pair of time courses, comparing patients and controls.

| On fMRI data
We measured the linear correlation among 47 ICN's time courses estimated from the resting fMRI data-the result in Figure 2. Figure 2a is the average FNC across 314 subjects.
The average of nonlinear dependencies across all subjects is calculated using our proposed NMI method for the same components ( Figure 2b). The assessment of randomness confirms that explicitly nonlinear FNC indicates highly significant modularity relative to a random matrix. Then, a t-test for one sample is applied to identify pairs with a significant difference in average value from the minimum average. After FDR thresholding (p < .05), most pairs are detected to be significantly larger than the minimum mean. This shows us the degree to which NMI differences are significantly different across the FNC matrix and demonstrates modular nonlinear dependencies between visual (VIS) and somatomotor (SM) components with other components. Interestingly, a low nonlinear dependency rate is observed despite a high linear correlation between subcortical (SC) and auditory (AUD). Next, the variation in nonlinear dependencies between healthy controls and schizophrenia patients is evaluated using our NMI method and compared using a two-sample T-test. In Figure 4a

| Boosted approach
Dependencies among components in healthy controls and schizophrenia patients are assessed with three methods: (1) Pearson correlation, in which the emphasis is on only linear correlation, (2) modified mutual information, as described in Section 2.3, quantifies only nonlinear dependencies, and (3) the boosted approach is explained in Section 2.6, is boosting the linear correlation by capturing nonlinear dependencies.
Next, in each method, the t-test is applied to compare the differences between two groups. Adjusted p value by FDR is thresholded (p < .05). The number of pairs with significant differences for the Pearson correlation method is 530, the modified mutual information method is 17, and the boosted method is 537. There are five pairs for which their linear correlation is not significantly different, but their nonlinear dependencies are significant. Note that linear and explicitly nonlinear analyses are complementary to one another. The explicitly nonlinear NMI approach gives 17 significant pairs regardless of whether their linear dependency is significant or not. That is, this information and differences in nonlinear changes are completely ignored by the linear correlation approach and importantly, the brain wide pattern of nonlinear group differences is highly modular and structured. The boosted method provides an option to capture both the linear and nonlinear components of dependencies. Results also show the boosted approach identifies relationship that is not captured by either the linear or full NMI approaches.

| Joint distributions
We were also interested in visualizing the relationship among time course pairs, which exhibited nonlinear differences between patients and controls, Figure 5  correlation only, and as a result, they miss the nonlinear contributions.
We establish an approach to evaluate the explicitly nonlinear dependencies between distinct brain regions by first removing the linear dependencies. We first demonstrated our approach works as expected on simulated data (Figure 1). Following the nonlinear dependencies among 47 time courses on 314 subjects are assessed ( Figure 2). A similar approach was applied to estimate how differently in average distinct regions of a schizophrenia patient's brain contributes nonlinearly to the context of functional connectivity (Figure 4).
Also, the joint distribution of five pairs with the largest group differences in the nonlinear dependencies in HC-SZ is studied ( Figure 5).
There are a number of possible causes of nonlinear dependencies, including (1) nonlinear hemodynamic effects. Studies on the relationship between neuronal activity, oxygen metabolism, and hemodynamic responses have shown the link between neuronal activity and hemodynamic response magnitude exhibits both linear and nonlinear effects in task data (Friston et al., 2000;Sheth et al., 2004;Wan et al., 2006). Nonlinearity can be induced via hemodynamic response changes with time. The hemodynamics can also vary between subjects and groups. Other results suggest a strongly nonlinear relationship between electrophysiological measures of neuronal activity and the hemodynamic response (Devor et al., 2003;Sheth et al., 2004).
(2) Differences in blood flow, blood oxygenation, and blood volume both within subjects and between groups. Experiments indicate that acquired vascular space occupancy (VASO), arterial spin labeling (ASL) perfusion, and BOLD signals respond nonlinearly to stimulus duration (de Zwart et al., 2009;Gu et al., 2005). (3)  We also found significant differences in the nonlinear relationships among the patients and controls. Nonlinear FNC pairwise comparisons between SZ and HC are shown in Figure 5a. In most cases, the controls are showing higher nonlinear dependencies relative to patients, mostly linked to the visual domain. There are a significant differences in nonlinear relation within visual (VIS) components as well as between VIS components and to other components such as auditory (AUD), somatomotor (SM), cognitive control (CC), and default-mode (DM) in SZ patient and HC. We observe that most of the patient/control differences involve visual and auditory components. This is intriguing given existing evidence suggesting some schizophrenia symptoms may be linked to the visual system (Gong et al., 2020;Johnston et al., 2005;Onitsuka et al., 2007). Having said that, visual symptoms such as visual hallucinations are rather uncommon in SZ and rarer than auditory and tactile abnormalities (van de Ven et al., 2017). In addition, some studies suggest inborn blindness may be shielded against the development of schizophrenia, characterized by inevitably noisy perceptual input that causes false inferences.
These findings argue that when individuals cannot see from birth, they depend more on the other senses. Thus, the resulting model of the world is more resistant to false interpretations (Landgraf & Osterheider, 2013;Leivada & Boeckx, 2014;Morgan et al., 2018;Pollak & Corlett, 2019;Riscalla, 1980).
While the results presented show the potential utility of focusing on explicitly nonlinear dependencies in fMRI data, there is still much work to be done. Future work should focus on carefully evaluating the possible sources of the nonlinear relationships. Quantitative fMRI studies could be used to isolate nonlinearities in blood oxygenation, volume, and flow. In addition, high field layer-specific fMRI studies could be used to evaluate nonlinearities in input vs. output layers. The contribution of various physiological variables (e.g., respiration, CO 2 , heart rate, and motion) could also be evaluated in future work.
In addition, our results provide evidence suggesting there are meaningful and significant nonlinear dependencies among fMRI time courses. We have shown evidence suggesting meaningful (modularized and group different) super-linear effects in FNC, which primarily implicates the visual cortex as disrupted in schizophrenia. We present two approaches, focusing on the explicitly linear effects or a boosted method that captures both linear and nonlinear effects within one metric. Future work should study the information contained in the nonlinear relationships. It could be studied with faster acquisitions, linked to multimodal imaging such as concurrent EEG data, and replicate the results we show in this work.