Group linear non-Gaussian component analysis with applications to neuroimaging

Independent component analysis (ICA) is an unsupervised learning method popular in functional magnetic resonance imaging (fMRI). Group ICA has been used to search for biomarkers in neurological disorders including autism spectrum disorder and dementia. However, current methods use a principal component analysis (PCA) step that may remove low-variance features. Linear non-Gaussian component analysis (LNGCA) enables simultaneous dimension reduction and feature estimation including low-variance features in single-subject fMRI. A group LNGCA model is proposed to extract group components shared by more than one subject. Unlike group ICA methods, this novel approach also estimates individual (subject-specific) components orthogonal to the group components. To determine the total number of components in each subject, a parametric resampling test is proposed that samples spatially correlated Gaussian noise to match the spatial dependence observed in data. In simulations, estimated group components achieve higher accuracy compared to group ICA. The method is applied to a resting-state fMRI study on autism spectrum disorder in 342 children (252 typically developing, 90 with autism), where the group signals include resting-state networks. The discovered group components appear to exhibit different levels of temporal engagement in autism versus typically developing children, as revealed using group LNGCA. This novel approach to matrix decomposition is a promising direction for feature detection in neuroimaging.


S.2.1 Robustness to misspecified number of group components
In this section, we report the results of group LNGCA and group ICA on the group components in simulations where the number of group components is misspecified as $q_G = 4$ or $q_G = 2$.
First, when $q_G = 4$, both methods perform similarly to the case $q_G = 3$: the three components matched to the true group components have correlations similar to those obtained when $q_G = 3$, as shown in ...
Although the test underestimated the dimensions in many simulations, this was due to individual components possibly being missed, whereas the group components were always retained.

S.2.2 Robustness to the number of time points
To examine the robustness of our method to the number of time points, we also run our simulations with T = 30 and T = 70. We keep all settings fixed except for the number of Gaussian ...

S.2.3 Decomposition of subject deviations from group signals
We run one repetition of the high-SVAR simulation setting described previously, except that we add subject-specific deviations in two subjects: one has extra active pixels at the top of the "1" component, and the other has extra active pixels at the bottom of the "1" component, as in Fig. S.9. The estimated group signal from group ICA is slightly worse than that from group LNGCA. For each of the two subjects, we plot the individual component that has the highest correlation with the true subject-deviation component. Group LNGCA captures the individual deviations with relatively high correlation (0.6).
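The matching step described above, selecting the estimated component most correlated with a known true component, can be sketched as follows. This is a minimal illustration rather than the authors' code: the function name `best_matching_component` and the toy data are hypothetical, and the use of the absolute correlation (to account for the sign ambiguity of estimated components) is an assumption.

```python
import numpy as np

def best_matching_component(components, target):
    """Return (index, correlation) of the row of `components` whose
    absolute Pearson correlation with `target` is largest.

    components: array of shape (n_components, n_pixels)
    target:     array of shape (n_pixels,)
    Assumption: the sign of an estimated component is arbitrary,
    so rows are ranked by absolute correlation.
    """
    components = np.asarray(components, dtype=float)
    target = np.asarray(target, dtype=float).ravel()
    # Center each component and the target, then compute Pearson correlations.
    c = components - components.mean(axis=1, keepdims=True)
    t = target - target.mean()
    corr = (c @ t) / (np.linalg.norm(c, axis=1) * np.linalg.norm(t))
    idx = int(np.argmax(np.abs(corr)))
    return idx, float(corr[idx])

# Toy data (hypothetical): five noise components, one of which is a
# sign-flipped, noisy copy of a sparse binary "deviation" map.
rng = np.random.default_rng(0)
true_deviation = (rng.random(1000) < 0.1).astype(float)
estimated = rng.standard_normal((5, 1000))
estimated[2] = -true_deviation + 0.1 * rng.standard_normal(1000)

idx, corr = best_matching_component(estimated, true_deviation)
print(idx, round(corr, 2))
```

The signed correlation is returned so that a flipped component can be re-oriented before plotting.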
We describe an approach to identify the individual components that represent subject-specific deviations from the group components. We refer to the components from the initial subject-level LNGCA (Step 1 of Algorithm 1) as the separate-subject components. We refer to the individual ...

S.3 Details on resting-state fMRI data example

S.3.1 Additional information on the data and preprocessing
All children completed a mock scan to acclimate to the scanning environment. Participants were instructed to relax, fixate on a cross-hair, and remain as still as possible. Functional data were preprocessed using SPM12 and custom MATLAB code (https://github.com/KKI-CNIR/CNIR-fmri_preproc_toolbox). Rs-fMRI scans were slice-time adjusted using the slice acquired in the middle of the TR as a reference, and rigid body realignment parameters were estimated to adjust for motion. The volume collected in the middle of the scan was non-linearly registered to Montreal Neurological Institute (MNI) space using the MNI EPI template. The estimated rigid body and nonlinear spatial transformations were applied to the functional data in one step, producing 2-mm isotropic voxels in MNI space. Voxel time series were linearly detrended. Data were excluded for between-volume translational movements > 3 mm or rotational movements > 3 degrees.
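Two of the preprocessing steps above, linear detrending of voxel time series and exclusion based on between-volume motion, can be illustrated with a short NumPy sketch. This is not the SPM12/MATLAB pipeline used in the study. The function names are hypothetical, and the sketch assumes that detrending removes both intercept and slope, that the six realignment parameters are ordered as three translations (mm) followed by three rotations already converted to degrees, and that the thresholds are applied per axis to differences between consecutive volumes.

```python
import numpy as np

def linear_detrend(ts):
    """Remove a least-squares linear trend (intercept and slope) from each
    column of a (T, V) array of voxel time series."""
    ts = np.asarray(ts, dtype=float)
    n_time = ts.shape[0]
    # Design matrix with a constant column and a linear time column.
    design = np.column_stack([np.ones(n_time), np.arange(n_time)])
    beta, *_ = np.linalg.lstsq(design, ts, rcond=None)
    return ts - design @ beta

def exceeds_motion_threshold(params, max_trans_mm=3.0, max_rot_deg=3.0):
    """Flag a scan whose between-volume movement exceeds either threshold.

    params: (T, 6) array; columns 0-2 are translations in mm and
    columns 3-5 are rotations in degrees (assumed layout and units).
    """
    params = np.asarray(params, dtype=float)
    # Absolute change between consecutive volumes, per parameter.
    diffs = np.abs(np.diff(params, axis=0))
    too_far = (diffs[:, :3] > max_trans_mm).any()
    too_rotated = (diffs[:, 3:] > max_rot_deg).any()
    return bool(too_far or too_rotated)

# Toy usage with hypothetical data: 50 time points, 4 voxels.
t = np.arange(50, dtype=float)
toy_ts = 2.0 + 0.5 * t[:, None] + np.sin(t)[:, None] * np.ones((1, 4))
detrended = linear_detrend(toy_ts)

toy_motion = np.zeros((50, 6))
toy_motion[20:, 0] = 3.5  # a 3.5 mm jump in x between volumes 19 and 20
print(detrended.mean(axis=0), exceeds_motion_threshold(toy_motion))
```

Whether the study thresholded each axis separately or a combined displacement is not stated in this section; the per-axis rule here is only one plausible reading.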
Group ICA and its PCA steps were applied using GIFT. The second-stage PCA was implemented using multi-power iterations (Rachakonda et al., 2016).

S.3.2 Dimension estimation
We applied the NG subspace dimension test in Section 2.3 to six participants. We also implemented a sequential test with FOBIasymp; the estimated dimensions are 126, 126, 126, 148, 151, and 153 for participants #1, ..., #6, respectively. Such large dimensions would do little to reduce computation in practice. This also implies that FOBIasymp tends to overestimate the number of non-Gaussian components, as discussed in our simulations.
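As a rough illustration of what a sequential dimension test involves, the sketch below runs a forward search: for k = 0, 1, ..., it tests the null hypothesis that there are k non-Gaussian components and stops at the first k that is not rejected. This is a generic sketch under stated assumptions, not the procedure used in the paper. The function `sequential_dimension`, the callable `pvalue_fn`, the level `alpha = 0.05`, and the forward search order are all hypothetical; in practice `pvalue_fn` would wrap a test such as FOBIasymp, and the search order used by the authors is not stated in this section.

```python
def sequential_dimension(pvalue_fn, n_dims, alpha=0.05):
    """Forward sequential search for the number of non-Gaussian components.

    pvalue_fn(k) is assumed to return the p-value for the null hypothesis
    that exactly k components are non-Gaussian (the rest Gaussian).
    Returns the smallest k in 0..n_dims-1 that is not rejected at level
    alpha, or n_dims if every null hypothesis is rejected.
    """
    for k in range(n_dims):
        if pvalue_fn(k) >= alpha:
            return k
    return n_dims

# Toy usage with made-up p-values: the null is rejected for k < 3 only.
toy_pvalues = lambda k: 0.001 if k < 3 else 0.40
estimated_dim = sequential_dimension(toy_pvalues, n_dims=10)
print(estimated_dim)
```

A forward search needs one test per candidate dimension up to the estimate, so an estimate above 100 implies over a hundred test evaluations per participant.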

S.3.3 Subject-specific components
Example subject-specific components are depicted in Figure S.10.

Figure S.10: Example subject-specific (individual) components from four different participants. These components include artifacts. Activation near the brain edge, as in the first and third rows, is often indicative of a motion artifact.