Surface‐Based Connectivity Integration: An atlas‐free approach to jointly study functional and structural connectivity

Abstract There has been increasing interest in jointly studying structural connectivity (SC) and functional connectivity (FC) derived from diffusion and functional MRI. Previous connectome integration studies almost exclusively required predefined atlases. However, there are many potential atlases to choose from and this choice heavily affects all subsequent analyses. To avoid such an arbitrary choice, we propose a novel atlas‐free approach, named Surface‐Based Connectivity Integration (SBCI), to more accurately study the relationships between SC and FC throughout the intra‐cortical gray matter. SBCI represents both SC and FC in a continuous manner on the white surface, avoiding the need for prespecified atlases. The continuous SC is represented as a probability density function and is smoothed for better facilitation of its integration with FC. To infer the relationship between SC and FC, three novel sets of SC‐FC coupling (SFC) measures are derived. Using data from the Human Connectome Project, we introduce the high‐quality SFC measures produced by SBCI and demonstrate the use of these measures to study sex differences in a cohort of young adults. Compared with atlas‐based methods, this atlas‐free framework produces more reproducible SFC features and shows greater predictive power in distinguishing biological sex. This opens promising new directions for all connectomics studies.

brain connectivities are particularly prominent. While dMRI measures the restriction of isotropic diffusive water movement (Basser, Mattiello, & LeBihan, 1994) and can be used to infer structural connectivity (SC) (Sporns, 2013a), fMRI measures the blood oxygen level dependent (BOLD) signal, deemed as a proxy for neurovascular coupling due to large-scale neural activation (Ogawa, Lee, Kay, & Tank, 1990) and can be used to infer functional connectivity (FC) (Sporns, 2013b).
Both SC and FC play important roles in understanding the human brain (Sporns, 2013b;Zimmermann, Griffiths, & McIntosh, 2018). As a result of the anatomy of white matter (WM), dMRI can be used to estimate the restricted diffusive patterns of water throughout the brain by applying magnetic gradients of multiple strengths in many directions (Li, Shi, & Toga, 2016). Diffusion signals are then reconstructed per voxel and represented as fiber orientation distribution functions (fODFs), which can then be fitted to tractography algorithms to estimate the locations of WM connections (Basser et al., 1994;Bastiani et al., 2012;Descoteaux et al., 2009;Girard, Whittingstall, Deriche, & Descoteaux, 2014;Tournier, Calamante, & Connelly, 2012). SC aims to measure the extent to which brain regions are connected by WM fiber bundles, where traditionally regions are obtained based on a parcellated atlas (Zhengwu Zhang, Allen, Zhu, & Dunson, 2019;Zhengwu Zhang et al., 2018). In the context of image analysis, we are limited to inferring SC from the streamlines connecting any pair of brain regions constructed by tractography algorithms (Girard et al., 2014;Maier-Hein et al., 2017;Thomas et al., 2014) due to MRI acquisition limitations, such as low spatial resolution and signal-to-noise ratio. On the other hand, FC is obtained from calculating the correlation of BOLD signals between different brain regions. Differences in magnetic susceptibilities between oxygenated and deoxygenated blood give rise to the BOLD signal after neural stimulation has occurred in a region (Ogawa et al., 1990).
BOLD imaging produces a time series of these susceptibility changes and is thought to represent the overall blood oxygenation exchange in each voxel as a function of time.
Although most existing connectivity studies explore SC or FC independently, there is increasing interest in exploring SC and FC together, referred to as SC and FC integration in this paper. Much of this previous integration workfalls roughly into three broad classes: prediction, modeling, and fusion. As one of the earliest papers studying the relationships between SC and FC, Honey et al. (2009) demonstrated that brain regions that are directly structurally connected have higher FC than those regions that are not directly connected, leading to a series of studies that attempted to predict an individual's FC directly from SC (Chamberland et al., 2017;Deligianni et al., 2013;Goñi et al., 2014;Higgins, Kundu, & Guo, 2018;Honey, Thivierge, & Sporns, 2010;Li, Shafipour, Shafipour, & Zhang, 2019;Messé, Rudrauf, Giron, & Marrelec, 2015). Modeling FC with neural spiking models by incorporating known SC as prior information has also drawn some attention recently. By assuming that SC and FC are highly correlated, it may be possible to derive FC patterns directly from neural spiking equations and SC to simulate hemodynamic response functions (HRFs) without collecting any functional data (Bassett, Zurn, & Gold, 2018;Iyer et al., 2013;Nakagawa, Jirsa, Spiegler, McIntosh, & Deco, 2013;P. Wang et al., 2019). Finally, connectivity fusion uses measurements from both dMRI and fMRI to derive new information about the brain (Bassett et al., 2018;Zhu et al., 2014). For example, Fan et al. (2016) built a novel brain atlas, called the Brainnetome atlas, using information from SC and FC, incorporating both structural and functional information across the brain.
These previous studies of SC and FC integration used different image resolutions, tractography algorithms, and brain parcellations to reconstruct the connectomes. Some found that the strength of FC is related to the anatomical pathways (SC) (F. D. Bowman, Zhang, Derado, & Chen, 2012;Goñi et al., 2014;Hermundstad et al., 2013;Van Den Heuvel, Mandl, Kahn, & Hulshoff Pol, 2009), while others concluded that the correlation between SC and FC is poor (Buckner, Krienen, & Yeo, 2013;Chamberland et al., 2017;Ghumman, Fortin, Noel-Lamy, Cunnane, & Whittingstall, 2016). These heterogeneous findings lead to several fundamental questions we must consider when studying SC and FC jointly: 1. Which structure in the brain (e.g., gray matter, white matter, the white or pial surface) is the best place for exploring the relationships between SC and FC? 2. Which parcellation is most suitable for SC and FC integration?
3. How should intra-cortical coupling between SC and FC be defined and how does such coupling vary across different populations?
Most existing studies use gray matter (GM) regions of interest (ROIs) as network nodes in deriving SC and FC. However, due to the limitations in dMRI acquisition (Reveley et al., 2015) and streamline reconstruction (Girard et al., 2014;Maier-Hein et al., 2017;Thomas et al., 2014), using GM ROIs to build SC can result in biased SC estimation. For example, streamlines can stop prematurely in the WM or near the GM-WM interface, superficial WM tracts can impede the construction of longer streamlines (Reveley et al., 2015), and the precision of streamline reconstruction is reduced due to partial volume effects (PVEs) in lower spatial resolution acquisitions (Tournier, Mori, & Leemans, 2011). Similarly, using GM ROIs as nodes in FC often requires volumetric smoothing and PVE correction, which reduces the spatial localization of BOLD signals and thus provides inaccurate FC estimation (Coalson, Van Essen, & Glasser, 2018).
Traditional construction of SC and FC often requires predefined parcellations/atlases for two primary reasons: (a) dimensionality reduction and (b) more straightforward interpretations of localized physical processes (Glasser, Coalson, et al., 2016). However, choosing an optimal brain parcellation for SC and FC integration is challenging due to the complicated nature of the brain. Parcellations derived from one modality (Gordon et al., 2014;Thomas Yeo et al., 2011) may not be suitable for studying others. As such, we currently do not know the most suitable parcellation to study SC and FC jointly. Finally, the correlation or coupling strength between SC and FC has the potential to unlock some key insights on how brain structure collaborates with function. For example, it is natural to expect that such a correlation can be spatially different across brain regions and populations (Nakagawa et al., 2013). However, few studies have evaluated the relationship between SC and FC at high resolution, nor how this relationship may differ between different groups of subjects (e.g. males vs. females).
With a comprehensive re-thinking of these problems, we propose a novel atlas-free approach, named Surface-Based Connectivity Integration (SBCI), to facilitate the integrative analysis of SC and FC. In SBCI, we propose to use the white surface (the interface of white matter and gray matter) to build both SC and FC and represent them in a continuous fashion without the need of any pre-specified brain parcellation. SBCI also gives three novel SC-FC coupling strength measures to study the relationship between SC and FC across the brain's cortical surface. Figure 1 shows a systematic overview of the SBCI framework. Compared with existing work, SBCI has the following unique features: • SBCI maps both SC and FC to the white surface. The white surface is the interface between cortical GM and WM. Since diffusion signals are mostly in the WM regions and functional signals are more present in the GM regions, we posit that the white surface is the best place to study the integration of SC and FC. While many FC studies are based on the cortical surface (Coalson et al., 2018;Glasser, Coalson, et al., 2016), studies of the SC on this surface are still limited. Extending tractography reliably to the white surface comes with many challenges (Reveley et al., 2015). A novel algorithm called Surface-Enhanced given parcellation, and discrete coupling with a given parcellation.
Details are presented in Section 2.5.
In the following sections, we describe the details of the SBCI pipeline, experiments used to validate it, and an application of the SFC features as imaging markers to distinguish biological sex.

| Datasets
The HCP is the current gold standard for human brain connectivity mapping. In order to develop and validate the SBCI pipeline, we used the HCP Young Adult (HCPYA) and HCP Test-Retest (HCPTR) data.
Data were downloaded from the ConnectomeDB website.
The data used in this paper include preprocessed T1-weighted (T1w) and dMRI images and unprocessed resting-state fMRI images from the HCPYA and HCPTR datasets. Full imaging acquisition information and minimal fMRI and dMRI image preprocessing steps are documented in Glasser et al. (2013). Briefly, all imaging was conducted on the 3T Siemens Connectom scanner (Erlangen, Germany). Highresolution T1w anatomical images were acquired with the 3D MPRAGE (magnetization prepared rapid gradient echo) sequence with a slice acceleration factor of 2 using 0.7 mm isotropic resolution. Diffusion imaging was performed using a 2D spin-echo EPI (echo planar imaging) sequence with approximately 90 diffusion directions at three nonzero b-values (1,000, 2,000, and 3,000 s/mm 2 ) each and 6 b0 ref- low-resolution MRI data (details are presented in Section S1; Data S1) and K. D. Murray et al. (2020)).

| Image preparation and preprocessing
Diffusion image preprocessing steps including brain extraction, susceptibility-induced distortion correction, motion correction, and eddy-current distortion correction are performed using tools in FMRIB's software language (FSL) (Smith et al., 2004;Sotiropoulos et al., 2013;Woolrich et al., 2009). For both HCP datasets, we downloaded the diffusion data after eddy-current correction. We begin with the T1w anatomical images after gradient correction and the raw rs-fMRI. Additional diffusion processing are performed using a combination of tools from Mrtrix3 (https://www.mrtrix.org/), advanced normalization tools (ANTs) (Avants et al., 2011), and the Sherbrooke Connectivity Imaging Lab toolbox in Python (Scilpy) and included resampling to 1 mm isotropic resolution and building the fODFs to prepare for tractography. and Murphy and Fox (2017). Additional processing details are available in Section S2 (Data S1).

| Continuous functional connectivity on the white surface
During the FsFast pipeline, we map the volumetric BOLD signals to the subject's 32k white surface, resulting in a BOLD time series at each vertex on the surface meshes. The FC between any pair of vertices is usually calculated using a partial correlation between the two BOLD time series (Smith, 2012), controlling for confounding signals.
Let s vi t ð Þ be the residual BOLD time series after regressing out nuisance variables at the ith vertex v i , and the FC between any vertex pair (v i , v j ) is calculated as: where we do not consider self-interactions.
Although the FC obtained in (1) can be high resolution with dense surface meshes, it is still considered to be discrete. In our SBCI pipeline, we represent FC in a continuous fashion. Hence, we introduce the continuous FC concept with some mathematical formalism. We use the convention that a vertex is a location on the mesh grid and a point is any location on the surface. Let Ω be the union of two disjoint white surfaces of the brain; the continuous FC is represented as a symmetric function on the space Ω Â Ω. For any pair of points (x,y) Ω Â Ω, we define the continuous FC as where s x and s y are the BOLD time series at points x and y. We define the set of all possible FCs as F FC ¼ Now we have two major problems before getting the continuous FC. First, we only observe BOLD signals at discrete vertices (after the FsFast preprocessing), not every point on Ω. Second, the white surfaces between subjects are different, making any group-wise analyses (analyses across multiple subjects) difficult. To solve these problems we inflate each white surface into a 2-sphere, and with some abuse of notation also denote Ω as the union of two 2-spheres  2 1 [  2 2 . The geometry of a 2-sphere will more easily facilitate signal processing and inter-subject alignment .
To obtain a BOLD signal for any point x Ω, we apply the following signal interpolation method on the 2-sphere. Without loss of generality, assume x 2 1 (the first 2-sphere in Ω) and let B(x; σ) represent a neighborhood near x such that all vertices in B(x; σ) have geodesic distances to x less than σ. We have the BOLD time course s x at x calculated as Þs v 0 , where s v 0 are observed BOLD signals and w σ (x,v 0 ) are the corresponding weights. w σ (x,v 0 ) can be obtained from a truncated Gaussian or bi-weight (quartic) kernel with bandwidth σ defined with a geodesic distance on the 2-sphere, for example, (Risk & Zhu, 2019). When multiple subjects are involved in the analysis, registration between subjects is necessary, i.e., finding correspondence among Ω 1 , …, Ω n for n subjects. One of the advantages of inflating the original complex white surfaces to two 2-spheres is that registration of signals on a 2-sphere is much easier than on an irregular manifold space (Coalson et al., 2018;Glasser, Coalson, et al., 2016;Kurtek et al., 2010;Robinson et al., 2014). In this work, we used the alignment algorithm and registration results from Freesurfer. Other spherical registration algorithms have been proposed in the literature (Robinson et al., 2014).

| Continuous structural connectivity on the white surface
In SBCI, we rely on two recent developments to reliably extend SC to the white surface and represent it in a continuous fashion.  (2017), we define the continuous SC by estimating a probability density function on Ω Â Ω.
In SET, the white surface is used to initiate the flow with a parameter t controlling for the amount of flow into the WM, resulting in a surface beneath the white surface. Starting from the surface at t > 0, we initialize streamlines using N s seed points on the surfaces and propagate them using the particle filtering technique (PFT) (Girard et al., 2014). Figure  We deviate from the traditional parcellation-based approach to SC and define a continuous SC similar to the continuous FC described in Section 2.3. Maintaining both SC and FC in a comparable continuous framework allows for a more robust treatment of the integration of the two modalities. For any two distinct points x and y on the brain surfaces, we define SC as where SC is the probability density function representing the likelihood that x and y are structurally connected by WM fibers. C SC is a symmetric function defined on Ω Â Ω and the set of all SC functions can be denoted as The limited number of streamlines (typically a few million) constructed with SET gives us a discrete and sparse sampling of C SC (refer to Figure 4 panel (a)). In contrast, FC has non-zero values for nearly every connection. In order to more fairly compare and integrate structural and functional connectivities, we estimate a smooth and dense SC using kernel density estimation (KDE) on Ω Â Ω. Assume that we have a symmetric kernel k h defined as a mapping from Ω Â Ω to ℝ + [ {0} with a bandwidth parameter h. The smoothed SC under a standard KDE procedure is given by: where N is the total number of observed streamlines, and (x i , y i ) represents the endpoints of the ith observed streamline. Now, we must define an appropriate kernel function k h on Ω Â Ω. Similar to Moyer et al. (2017) and Risk and Zhu (2019), we begin by defining a symmetric heat kernel (Hartman & Watson, 1974) on a 2-sphere as: where μ 2 and h ℝ + represent the mean and bandwidth, respectively, A 2 = 4π (the area of  2 ), m(m + 1) are the eigenvalues of the Laplacian on  2 for m = 0, 1, …, ∞, P m is the Legendre polynomial of order m for ℝ 3 , N m equals 2m + 1, the number of linearly independent homogeneous spherical harmonics of degree m in ℝ 3 , and ⟨,⟩ indicates the inner product of two elements on  2 . Since Ω ¼  2 1 [  2 2 , we extend f h to Ω trivially by letting f h (x; μ) = 0 if x and μ are not on the same sphere. We then define a kernel function on the domain Ω Â Ω using a product of two functions f h on Ω. That is, given a mean (μ x , μ y ) the The bandwidth of the kernel function is a key parameter under KDE. Although many bandwidth selection criterion have been proposed (Botev, Grotowski, & Kroese, 2010;A. W. Bowman, 1984; C. ; M. C. Moyer et al., 2017;Risk & Zhu, 2019;Turlach, 1993, January;Zhengwu Zhang, Klassen, & Srivastava, 2019), there is no consensus on the best general approach. In this paper, we select h based on the reproducibility ofĈ SC x, y ð Þ calculated on the HCPTR dataset.

| SC-FC coupling
We are ready to study the integration of continuous SC and FC defined on the domain Ω Â Ω. We define three novel definitions of SC-FC coupling (SFC) based on our continuous SC and FC. Continuous global SC-FC coupling: We first evaluate the consistency between SC and FC on the surface without any predefined parcellation of the brain. Let C SC x, y ð Þ and C FC x, y ð Þ denote the continuous SC and FC for a particular subject. At any point x 0 in Ω, we define the global SC-FC coupling using a normalized inner product of two The inner product h,i is defined for two functions on Ω and is analogous to calculating the correlation between two rows of a functional connectivity and structural connectivity matrix in a traditional setting.
The SFC gbl (x 0 ) returns a scalar at x 0 , and therefore, SFC gbl is a continuous function on Ω, measuring the similarity/consistency between SC and FC at different locations on the white surface. In practice, we evaluate the SFC gbl on a discrete grid of Ω. The mesh surfaces from Freesurfer provide a natural choice for such a grid. However, the dense vertices on these surfaces cause computational challenges. In our implementation of SBCI, we down-sample each white surface mesh (left and right) from over 120,000 vertices to around 2,100 using the Visualization Toolkit (VTK) in Python (Schroeder, Martin, & Lorensen, 2006;Schroeder, Zarge, & Lorensen, 1992). To maintain as much topological information as possible, we sample vertices from the white surface such that the induced Hausdorff distance (Aspert, Santa-Cruz, & Ebrahimi, 2002) between the full and down-sampled meshes is minimized. We then generate the down-sampled white, inflated, and spherical surface meshes using Delaunay triangulation (Barber, Dobkin, & Huhdanpaa, 1996) and the coordinates corresponding to the sampled vertices on the full meshes. Figure 3 shows a surface mesh before (panel (a)) and after down-sampling at three different sparsity levels of vertices (panels (b), (c) and (d) Wang et al., 2018;Zhiqiang Zhang et al., 2011), we define discrete SFC for a given parcellation. We first convert our continuous SC and FC to finite adjacency matrices based on the given parcellation. For any two ROIs E 1 and E 2 , we define the SC between the two regions as: where jE i j represents the area of region E i for i = 1, 2. The defined SC strength represents the connectivity density in a unit area square.
The traditional way to calculate FC is to first obtain a mean BOLD signal for each region and then calculate the Pearson correlation coefficient. To be more consistent with the SC calculation, we instead consider an average of the correlations. We calculate the FC in the following way: first apply the Fisher z-transformation to the correlation, calculate the average, and then apply the inverse Fisher ztransformation: where tanh(Á) and artanh(Á) represent the hyperbolic tangent function and its inverse respectively. Finally, we define the discrete SC-FC coupling (SFC dct ) as: where corr(Á, Á) represents the Pearson correlation and SC(E, Á) and FC (E, Á) represent the structural and functional connections between ROI E and all other ROIs respectively.  Zhang et al., 2018), which is a generalization of the intraclass correlation coefficient (ICC) (Shrout & Fleiss, 1979), with values in the range (0, 1). d 2 bs and d 2 ws represent the average distance squared between subjects and within multiple scans of a subject respectively. The dICC is calculated under the Frobenius norm using the entire connectivity matrix. Higher dICC values indicate better reproducibility.

| Evaluation and analysis
The surface smoothing kernel FWHM σ is another parameter in SBCI that can be tuned for the functional data. We selected σ = 5 mm to remain consistent with typical fMRI preprocessing procedures.

| SBCI connectome reproducibility
To validate the SBCI pipeline, we performed qualitative and quantitative exploratory analyses to assess the reliability of our pipeline and compared them to previous studies. These analyses fall into two categories: SC and FC reproducibility and the relationship between SC and FC. Comparisons were made using the atlas-free approach and atlas-based approaches. With the atlas-free approach, we performed visual inspections of SC and FC at different spatial resolutions (different grids on Ω, refer to Figure 3) and calculated the ICC and dICC using the HCPTR dataset. While ICC produces a value at every vertex, allowing for the calculation of summary statistics of the distribution, dICC produces only a single value per connectome. As such, to account for variability we obtained 10,000 bootstrap samples (randomly sampling 36 subjects with replacement) and took the median dICC value and interquartile range (IQR). IQR is defined as the difference between the 75th and 25th percentiles. Given the bootstrap results, we also calculated the p-value P(dICC < 0.5), where dICC < 0.5 indicates that the measurement is not reproducible. Note that in each test and retest scanning session in the HCPTR dataset, we had one dMRI scan and four resting state fMRI runs, resulting in a total of 72 SC matrices and 288 FC matrices for the entire dataset.

| SC-FC coupling reproducibility
After confirming the validity of our SBCI pipeline to produce consistent and reproducible SC and FC at both standard atlas and atlas-free resolutions, we sought to examine the reproducibility of the SFC features. After registering each subject's images to the standard fsaverage space, we obtained aligned SC and FC on a common white surface. Using the SFC definitions presented in Section 2.5, we quantified the reproducibility of each feature using both ICC and dICC measures. For each subject in the HCPTR dataset, we had 2 SC matrices and 4 FC matrices, and therefore obtained 8 (= 2 Â 4) SFC measures for each definition of SFC for each subject, resulting in 288 (= 8 Â 36) total features for all 36 subjects. The five SFC features considered in this paper were: global SC-FC coupling (SFC gbl ; Equation (4)); local SC-FC coupling based on the Desikan-Killiany atlas (SFC locdk ; Equation (5)); local SC-FC coupling using major brain lobes (SFC loclb ; Equation (5)); discrete SC-FC coupling based on different atlases (SFC dct.. ) defined by Equation (8); and traditional discrete SC-FC coupling (SFC trd.. ), where SC is calculated based the summation of streamlines in a connection and FC is calculated based on the Pearson correlation of the mean BOLD signals in each ROI pair.
Note that for SFC loclb the lobar parcellation was created by merging ROIs within the Desikan-Killiany atlas to achieve 12 larger areas (right and left frontal, parietal, temporal and occipital lobes, and the right and left insula and cingulate).

| SC-FC coupling as imaging markers
In order to assess the usefulness of the SFC features as imaging markers, we used the HCPYA dataset, described in Section 2.1, to find group differences due to sex. We first performed independent t-tests on the surface.
A nonparametric suprathreshold cluster test (Nichols & Holmes, 2002) was conducted to determine significant regional differences between the two groups. Clusters were defined as those regions of neighboring vertices at which the uncorrected p-values (from the t-tests) between the two groups were ≤ 0.05. We then generated a permutation distribution for the maximal cluster size and determined a critical suprathreshold cluster size with threshold p ≤ 0.05. All vertices within a significantly large cluster were deemed to be significant. Suprathreshold cluster tests like this are generally more powerful for neuroimaging data than techniques such as Bonferroni adjustment and Benjamini-Hochberg false discovery rate control (Friston et al., 1994;Nichols & Holmes, 2002).
We also conducted a classification analysis to compare the discriminative abilities of different SFC features in distinguishing males from females. Principle component analysis (PCA) was first performed on the training data and applied to the test data to reduce the dimensionality of each SFC feature before different classifiers were applied.
To make a robust comparison, results were obtained from different classifiers including the logistic regression classifier (LRC), support vector classifier (SVC), and random forest classifier (RFC). Repeated five fold cross-validation was performed (100 times), tuning hyperparameters through nested cross-validation, and the receiver operator characteristic (ROC) area under the curve (AUC) was used to evaluate the classification performance for each iteration.

| Parameter selection
We selected t = 75 and N s = 3 Â 10 6 for all subsequent experiments. Details are presented in Data S1. The flow parameter t = 75 maximizes the dICC and thus SC reproducibility regardless of the number of seeds used for every approach except the Desikan-Killiany atlas. The number of seeds selected was determined based on a diminishing return of dICC as a function of computational resources.
We then sought an optimal value for h to maximize the dICC of the atlas-free approach. Table 1 shows dICC values for the four approaches with five different KDE bandwidth values, h {0, 0.002, 0.005, 0.01, 0.02} after removing two outliers. As our pipeline is designed to conduct connectivity analyses using an atlas-free approach, we selected h = 0.005 because it maximized the dICC value when using the atlasfree approach and was still close to maximizing the atlas-based approaches. We found that smoothing was especially helpful for enhancing the reproducibility of high resolution SC. In fact, after smoothing, the dICC for our atlas-free approach was nearly the same as all three parcellations, which is a substantial improvement over the original unsmoothed SC (h = 0). Using an Intel CPU (2.70 GHz), it took approximately 8, 4, and 2 hrs to smooth SC matrices obtained using grids Ω of 16,906, 10,314, and 3,668 vertices, respectively, producing SC matrices requiring 1.1GB, 0.5GB, and 0.1GB of storage. The full SBCI pipeline took approximately 4-5 days (10 hrs for fMRI processing, 1 day for T1w processing, 2-3 days for dMRI processing, and 12 hrs for connectome integration) to run for a single subject. Next, we evaluated the relationship between SC and FC. In Figure 5, we show the histograms of FC strengths with and without direct SC connections. In our continuous SC framework, we defined a direct SC connection as those node-pairs with SC values greater than 10 À7 . This means if ten million streamlines were built for a subject, the connections that had more than one expected streamline were considered to have a direct SC connection. We found that FC strengths between regions with direct SC is consistently higher than FC strengths between regions without direct SC connections (p-value < 10 À6 ).

| Exploratory connectome analyses
Finally, we quantitatively assessed the reproducibility of SC and FC produced by SBCI. We compared the dICC and node-wise ICCs

| SC-FC coupling reproducibility
We then evaluated the reproducibility of our SFC features. Figure 7 shows the pairwise distance matrices, dICC, and ICC histograms for  (Desikan), SFC dctds (Destrieux), and SFC dctbr (Brainnetome) were 0.60 (0.01), 0.61 (0.01), and 0.62 (0.01) respectively. All p-values were less than 10 À6 . The ICC F I G U R E 5 Histograms of resting state FC for nodes-pairs with and without direct WM SC connections. Elements from the FC matrix were assigned to one of two groups depending on whether or not there is a value > 10 À7 present at the corresponding element in the SC matrix. The histograms show the distribution of FC strength for both of these groups using only those connections within a region (a, b, d, and e) (defined by the Desikan-Killiany atlas) or only those connections shared across regions (c, f, g, and h). Finally, (i) shows the histogram for the whole brain using every connection over the entire surface values for the discrete SFC features were all in the poor reliability range.
Comparing results in Figure 7 with the ones in Figure 6, Figures S3, S5, and S6.
Our results show that the within-subject SFC gbl , SFC locdk , SFC loclb , SFC dctdk , SFC dctds , and SFC dctbr were reproducible (although to different degrees) in data from two sessions that occurred in the span of several months in healthy young adults, and the SFC gbl was more variable between subjects than within subjects. Therefore, we expected the SFC features to be informative and robust markers to detect individual or group effects.

| SC-FC coupling sex difference
Finally, we evaluated the usefulness of our SFC features as markers to detect group differences in sex. We used a small subset of the HCPYA data (due to high computational times and storage demand for each subject we have not processed all HCPYA subjects yet) containing 43 randomly selected males and 46 randomly selected females from the 26 to 30 year old group in the S500 data release. Figure 8 shows the average continuous (atlas-free) SFC features (SFC gbl , SFC locdk , and SFC loclb ) across all males and females on the inflated brain surfaces, and Figure S7 displays the average discrete SFC features (SFC dctdk , SFC dctds , and SFC dctbr ). Additionally, using the three continuous SFC features, we conducted point-wise independent t-tests on the surface between males and females. Figure 9 shows the p-values on the surface (the first two rows) and binary maps (the third and fourth rows) indicating significant differences after correcting for multiple comparisons based the nonparametric suprathreshold cluster test.  Table S4.
F I G U R E 6 Reproducibility analysis of SC and FC from SBCI. The first row shows pairwise distance matrices of SCs and FCs generated under (a) the Desikan-Killiany atlas and (b) our atlas-free approach. The Frobenius norm was used to calculate the distance between connectivity matrices, and the different scales are due to the differences in the connectivity matrices generated by each method (the relative difference between the intra-subject distances and the intersubject distances is more important than the magnitude of those distances). Note that we had 72 SC and 288 FC matrices in the HCPTR dataset. Medians (IQR) from bootstrap sampling are displayed above each distance matrix. Panels (c) and (d) show ICC histograms for every node-pair in the SCs and FCs corresponding to panels (a) and (b) respectively. Medians (IQR) are displayed above each histogram. We placed scans from the same subject next to each other, so that we could observe a block pattern along the diagonal of these distance matrices F I G U R E 7 SFC reproducibility analysis. The first and third rows from left to right show pairwise distance matrices for the SFC gbl , SFC locdk , and SFC loclb and the SFC dctdk (Desikan), SFC dctds (Destrieux), and SFC dctbr (Brainnetome) features respectively. Medians (IQR) from bootstrap sampling are displayed above each distance matrix. The second and fourth rows show the corresponding histograms of ICC at each node. Medians (IQR) are displayed above each histogram

| DISCUSSION
In this paper, we developed a novel atlas-free approach for studying connectivity integration. The proposed SBCI framework can extract reproducible and discriminative high-resolution structural connectivity (SC) and functional connectivity (FC) from high-quality MRI data on the white surface of the brain. Additionally, SBCI produces three novel subject-level imaging markers that are reflective of the relationships between structural and functional brain signals. By using the HCP avoid the need to rely on discrete brain parcellations to study connectivity. Further, by extending the typically sparse SC to be a continuous feature, we close the sparsity gap between SC and FC, allowing us to interrogate structural-functional relationships more robustly, as FC is naturally more dense. This continuous treatment of SC also allows us to overcome some of the computational challenges that come with big data; we do not need to recover hundreds of millions of streamlines in order to obtain reproducible SC at high resolutions (refer to Table 1 and Table S3).
Another advantage of SBCI is that we can inspect SC-FC relationships in local brain regions as well as across the entire cortical surface (Honey et al., 2009). Figure 5 shows that in general, FC connection strengths are higher between areas of the brain when direct SC connections are present compared to those without direct SC connec-  (Chamberland et al., 2017;Honey et al., 2009), implying that more care should be taken in the future to more completely examine such relationships.
Although the idea of dense FC and SC has been proposed before (e.g. the HCP promotes the use of brain mesh surfaces to analyze fMRI data (Coalson et al., 2018;Glasser, Coalson, et al., 2016) and Moyer et al. (2017) proposed a point process model to estimate continuous SC), continuous structural and functional connectome coupling has not been studied before to the best of our knowledge. In F I G U R E 8 Mean SFC gbl , SFC locdk , and SFC loclb values calculated over 43 male and 46 female subjects existing literature, using predefined parcellations to calculate discrete SFC is a standard choice (Baum et al., 2020;Buckner et al., 2013;Chamberland et al., 2017;Cocchi et al., 2014;Ghumman et al., 2016;Honey et al., 2009;Honey et al., 2010;Jiang et al., 2019)  From the first two columns in Figure 8, we found that the global SC and FC are more correlated in primary sensory/motor areas such as S1/M1 and the visual and auditory cortices and less correlated in secondary association areas like the prefrontal cortex.
The spatial distribution of SFC gbl seems to be consistent with the fundamental organizing architecture of the brain known since Brodmann's map was published in 1909 (Brodmann, 2007).
Brodmann's work began with the idea that "specific physiological functions in the cerebral cortex depend on specific histological structure and connectivity." This principle is clear in primary sensory areas where specific histological patterns and cortical layer structures are closely associated with functional activity (Zeki, 2016). More modern mapping identified the hierarchical map organization according to unimodal and multimodal association areas (Mesulam, 2000). Our global SFC feature peaks in sensory areas and gradually drop into the unimodal and multimodal association areas. These observations highlight the ability of our approach to advance brain mapping using modern data measurement techniques based on MRI.

| SFC differences between males and females
The difference between the brains of males and females remains an interesting question that is largely unanswered. Past findings show that male and female brains have anatomical, functional, and biochemical differences (Weis et al., 2020;Zaidi, 2010). With SBCI, sex differences were observed most strongly in the ventro-medial prefrontal cortex, the somatosensory-motor areas, the supramarginal gyrus, and the occipitoparietal (before FDR) areas extending into the fusiform gyrus. These results emphasize how structure and function are differentially related between the sexes and are consistent with reported behavioral differences between men and women. For example, men and women perform differently on emotional recognition (Lausen & Schacht, 2018) and emotional decision making (van den Bos, Homberg, & de Visser, 2013), which is well known to engage the ventro-medial prefrontal cortex (Bechara, Tranel, & Damasio, 2000). Men and women are also known to perform differently on facial recognition (Herlitz & Lovén, 2013), visual motion processing (S. O. Murray et al., 2018), and episodic memory recollection (Yonker, Eriksson, Nilsson, & Herlitz, 2003). Direct correlation between our measure of sexrelated SFC differences and cognitive-behavioral differences between the sexes remains to be tested.
Our classification results (  (Messé, 2019). Given the predictive powers of our SFC features to distinguish sex, we expect that these measures will also act as robust connectivity biomarkers for pathological applications, to be explored in future studies.

| Future work
One limitation of the SBCI pipeline is that SET is sensitive to the input surfaces according to our results from the HCPTR dataset. SET uses the geometry of the surface a priori to initiate WM streamlines. If surface reconstructions are significantly different across two scanning sessions for a single subject, then SBCI is unlikely to produce reliable SC for that subject. As such, we removed two outliers as determined by the differences in white surface reconstructions between the test and retest scans for our reproducibility analyses. Future work for improving SBCI should also focus on faster and more robust surface reconstruction (e.g. better segmentation and surface reconstruction methods (Henschel et al., 2020;Zhao et al., 2019) or collecting and incorporating high resolution T2-weighted (T2w) or fluid attenuated inversion recovery (FLAIR) images to the T1w processing pipeline (Glasser et al., 2013;Renvall, Witzel, Wald, & Polimeni, 2016;Van Essen et al., 2013;Zaretskaya, Fischl, Reuter, Renvall, & Polimeni, 2018). Further, the fMRI signal resampling and smoothing of SC were conducted on the spheres and the actual mapping between the white surfaces and spheres may introduce distortions. Another future work is to develop a more direct method where the resampling and SC smoothing are performed on the original white surfaces and compare it with the current spherical framework.
The computational resources required to perform SBCI, while not prohibitive, provide a challenge for analyzing large datasets. Future work will involve decreasing the total processing time. For example, deep learning has made it possible to reconstruct reliable surfaces in a fraction of the time required to perform recon_all (Henschel et al., 2020). Additionally, improvements to the KDE smoothing algorithm employed in this pipeline should also reduce processing time. As other faster diffusion and functional preprocessing pipelines (provided by FSL and Freesufer) become available, the computational cost accompanied by SBCI should decrease.
Finally, the use of subcortical structures is crucial to understanding the effects of various pathologies on SC and FC. As such, future developments to SBCI include extending both SC and FC to include subcortical regions in order to study functional networks beyond the cortical areas.
We have demonstrated that the SBCI pipeline can reliably reconstruct high resolution SC and FC and produce novel SFC features, removing the need of using predefined brain parcellations for connectomics studies that can potentially bias their results. Future work should use SBCI to explore pathological differences in various neurological diseases including HIV-associated cerebral small vessel disease (CSVD), mild cognitive impairment (MCI), Alzheimer's disease (AD), and chronic pain, among others. Finally, we intend to use SBCI to determine an optimal brain parcellation for studying structural, functional, and structural-functional coupling for individual and group effects.