Tensor clustering on outer-product of coefficient and component matrices of independent component analysis for reliable functional magnetic resonance imaging data decomposition

BACKGROUND
Stability of spatial components is frequently used as a post-hoc selection criteria for choosing the dimensionality of an independent component analysis (ICA) of functional magnetic resonance imaging (fMRI) data. Although the stability of the ICA temporal courses differs from that of spatial components, temporal stability has not been considered during dimensionality decisions.


NEW METHOD
The current study aims to (1) develop an algorithm to incorporate temporal course stability into dimensionality selection and (2) test the impact of temporal course on the stability of the ICA decomposition of fMRI data via tensor clustering. Resting state fMRI data were analyzed with two popular ICA algorithms, InfomaxICA and FastICA, using our new method and results were compared with model order selection based on spatial or temporal criteria alone.


RESULTS
Hierarchical clustering indicated that the stability of the ICA decomposition incorporating spatiotemporal tensor information performed similarly when compared to current best practice. However, we found that component spatiotemporal stability and convergence of the model varied significantly with model order. Considering both may lead to methodological improvements for determining ICA model order. Selected components were also significantly associated with relevant behavioral variables. Comparison with Existing Method: The ﻿Kullback-Leibler information criterion algorithm suggests the optimal model order for group ICA is 40, compared to the proposed method with an optimal model order of 20.


CONCLUSION
The current study sheds new light on the importance of temporal course variability in ICA of fMRI data.

2017; Seifritz et al., 2002;Tavor et al., 2016), Diffusion Tensor Imaging (DTI) (Wu et al., 2015) and multi-modal MRI data fusion (Calhoun et al., 2006;Sui et al., 2012). Despite its widespread use in neuroimaging, there is limited standardization for ICA methodology in the field that limits the replicability of previous studies and comparisons between research protocols.
For ICA decompositions, the model order (number of extracted components) is an important factor that effects the reliability of the results (Abou-Elseoud et al., 2010;Allen et al., 2012;Li et al., 2007). Because brain function is complex, it is impossible to know the "true" number of components within the brain and previous work has shown that different brain networks are split at higher dimensionalities, whereas others do not appear at lower dimensionalities at all. Further investigation of the methods used to determine model order in the ICA decomposition are needed to improve replicability and standardization between studies.
Currently, there are no standardized methods for determining the optimal dimensionality. For any one ICA algorithm, information theoretic criteria are typically used to determine model order, and the stability of components may be assessed post-hoc as additional validation of the model order. Different ICA algorithms produce different optimal model orders from the same data. The stability of the ICA algorithm itself is partially based on the convergence of the decomposition. Selfadaptive ICA algorithms converge to a local minima (Himberg et al., 2004), and algorithms which fail to converge over many iterations are considered unstable. The convergence performance can vary as the number of iterations required to converge increase, which can lead to unreliable spatial maps or time courses across analyses of the same data that limit the interpretability and reproducibility of findings.
For ICA of multi-subject, or group, fMRI studies, data from all participants can be temporally concatenated, spatially concatenated, or concatenated by spatiotemporal tensor (Shi et al., 2017) prior to applying ICA. Post-hoc assessment of the stability of the ICA decomposition is traditionally evaluated on the basis of resulting IC spatial patterns, as opposed to their corresponding temporal courses. There is limited research addressing the effect of temporal courses on ICA stability. BrainVoyager (Formisano et al., 2004), FMRIB Software Library (FSL; Beckmann and Smith, 2004;Beckmann et al., 2005), and Group ICA of FMRI Toolbox (GIFT; Calhoun et al., 2001) are three widely used software packages used for ICA of fMRI data. BrainVoyager and FSL do not include functionality for post-hoc evaluation of the stability of the algorithm, whereas GIFT (Calhoun et al., 2001) includes assessment of the stability of the ICA via resulting spatial patterns using ICASSO (Himberg et al., 2004). However, ICASSO does not assess the stability of the inverse of the coefficient matrices (i.e., temporal courses for fMRI data analysis). Although some studies have proposed additional methods to evaluate the stability of ICA algorithsm (Wisner et al., 2013;Meindl et al., 2010), none address the stability of temporal data.
Both the resulting spatial maps and their temporal courses provide important information in the context of fMRI research (Tian et al., 2013). Spatial maps provide information about brain network connectivity, and allows for in vivo mapping of neural circuitry in the human brain (Zuo et al., 2010). The temporal courses of the independent component spatial maps are crucial for interpreting each IC spatial map (Song et al., 2015(Song et al., , 2014 e.g., as reflecting a brain network, a motion artifact, or other noise-related signals. E.g., the temporal courses that are highly correlated with stimulus presentation during tasks reflect task networks, while those highly correlated with motion time courses reflect subject motion. Alternatively, IC temporal courses may be comprised of low-frequency oscillatory signals associated with resting state networks (Jafri et al., 2008). Furthermore, time course variability has also been associated with pathological processes, such as chronic pain (Malinen et al., 2010). Although the importance of temporal courses is known, it remains unclear whether instability in temporal courses may also be useful for assessing the performance of the ICA decomposition and replicability of findings.
In order to address this gap in the literature, we propose a new method, Tensor Clustering, which will allow us to assess both temporal and spatial stability in the ICA decomposition. We create a "tensor" by fusing information from both temporal courses and spatial patterns mathematically. Then, we apply hierarchical cluster analysis within the tensor space to evaluate the stability of ICA decomposition. The aims of the present study are to (1) determine if the use of spatiotemporal information affects the stability and/or convergence of a given ICA algorithm and (2) to evaluate the incremental utility of incorporating spatiotemporal tensors into ICA decomposition. The performance of ICA of spatiotemporal fMRI data was assessed for two popular ICA algorithms, InfomaxICA and FastICA, using both simulated and in vivo human fMRI data.

ICA model
When ICA is used to decompose 4-D fMRI data (a timeseries of 3D brain images), it is usually assumed that the measured fMRI signals are produced by multiple source signals. This can be expressed by the following noise-free ICA model: Where, ∈ × Z N M are the fMRI data, N is the number of image timepoints and M is the number of voxels in each fMRI image. ∈ × A N R denotes the mixing matrix, with each column representing the temporal course of each source. R is the number of extracted components (i.e., model order) for the ICA algorithm. ∈ × S R M represents the source matrix, where each row of the matrix is a spatial map of each source. ICA is an algorithm that estimates matrices A and S in a purely datadriven fashion. Group ICA (GICA) is usually adopted for multi-subject fMRI data analysis. For large data, such as fMRI data, the data are typically reduced prior to ICA using principal component analysis (PCA). Reduction in this manner greatly reduces the size of the data and the number of parameters that must be estimated during ICA. There are differences among BrainVoyager, FSL, and GIFT in this data reduction step prior to GICA of multi-subject fMRI data; with BrainVoyager and GIFT implementing data reduction steps at the single-subject level and again at the group-level (on concatenated reduced data) and FSL implementing only group-level PCA data reduction steps. Considering both subject-and group-level PCA-based data reduction, after data reduction at the individual level, the reduced data from individual subjects are concatenated temporally, then the aggregate data matrix is decomposed by ICA (details see Appendix. A).
The ICA model in Eq.
(1) can also be expressed by the sum of a rank-1 matrix: Where, ∈ × a r N 1 is the r th column of A, e.g., the temporal course of the r th component. ∈ × s r M 1 is the r th row of S, e.g., the spatial pattern or distribution of r th component. Previous research (Cong et al., 2011) has shown that the estimation of a r or s r can be biased by magnitude and polarity indeterminacy. However, the rank-1 matrix E r in Eq.
The rank-1 matrix ∈ × E r N M is the outer product of a temporal course and its corresponding spatial distribution. For GICA, the rank-1 matrices of different subjects are unique. For r th component and m th subject, the rank-1 matrix is calculated as follows: is from Eq. (A.6). V m is the data reduction matrix of the m th subject. G is the data reduction matrix for the aggregate data matrix in temporal concatenated GICA. ∈ × W R R is the unmixing matrix estimated by the ICA algorithm. y r is the r th spatial component estimated by the ICA algorithm. In this study, the stability of E m r was evaluated. The information from both temporal courses and spatial distributions was used to assess the stability of ICA algorithm.

Convergence of ICA algorithm
InfomaxICA algorithm. For InfomaxICA (Bell and Sejnowski, 1995), the formula used for iterations is as follows: Where x is an input vector, y is an output vector from monotonic transformation by a nonlinear function, 1 is a vector of ones. W is the unmixing matrix for the ICA decomposition. In each Infomax ICA decomposition, the W is initialized randomly, and updated based on Eq. (5). Stochastic gradient ascent is used to find the unmixing matrix W, while the sigmoidal nonlinearity provides necessary higher-order statistical information Makeig et al., 1997). W Δ is the change in W between iterations. The algorithm converges when the norm of W Δ is less than some number, for example − 10 6 is the default of InfomaxICA algorithm implemented in MATLAB (Mathworks, Natick, MA). The W at this time is the unmixing matrix of ICA algorithm for given data. The number of iterations updated in this process determines the number of steps required for algorithm convergence. For In-fomaxICA implemented in MATLAB, the default iteration limit is 512, after which the process terminates if convergence is not reached.
FastICA algorithm. For FastICA (Hyvarinen, 1999), the formula used for iterations is as follows: where • g ( ) is a nonlinear function related to the probability density function of the source signals. ′ g (·) is the first derivative of • g ( ). E[ ] denotes the expectation of the variable. In this study, we used a symmetric approach, with the nonlinearity being tanh (Correa et al., 2007).
To study the convergence of each ICA algorithm, the ICA algorithm was run 50 times after choosing appropriate parameters. The total number of runs that converged was recorded as an index of performance of the ICA algorithm.

Hierarchical cluster analysis of tensors
The stability of spatial and temporal information from ICA is different in practice. To calculate temporal courses of ICA components, a matrix inversion is used, which magnifies the instability of their estimation. We show this as follows. First, a matrix ∈ × W 20 20 was generated to represent the unmixing matrix from ICA decomposition with a dimensionality of 20, and then noise was added to W (16 th -20 th components) to represent the effect of ICA decomposition instability on the unmixing matrix, which together is W _noisy. The coefficient matrices A and A _noisy are calculated via the inverses of W and W _noisy, respectively. The correlation between A and A _noisy represents an index of stability of the temporal courses. As shown in Fig. 1, the correlation coefficients of W and W _noisy are equal to 1 for the first 15 components and less than 1 for the last 5 components. However, the correlation coefficients between A and A _noisy are less than 1 for all components, which represents instability. The example shows that the stability of the matrix inversion is sensitive to noise, which directly impacts the temporal courses.
Tensor Clustering was applied to assess the stability of E r , the outer product of a temporal course and its corresponding spatial distribution. Fig. 2 shows an illustration of the tensor for assessing fMRI ICA results. Multiple decompositions are ran using each ICA algorithm (InfomaxICA or FastICA). Each independent component for each decomposition has a temporal course and a corresponding spatial map. The outer-product of them is a rank-1 matrix. The number of rank-1 matrices is × R K when the model order is R and the same algorithm is run K times. The set of rank-1 matrices is stacked into a tensor. The stability of the rank-1 matrices can then be assessed with Hierarchical Clustering Analyses in tensor space, which identifies clusters based on the similarity of the rank-1 matrices. Ideally, R classes (or clusters) comprised of the same component from the K different runs will be identified. This process is similar to ICASSO, but incorporates temporal information via the rank-1 matrix.
The calculation of the similarity matrix of E r used in Tensor Clustering can be computed from the following formula: Where Sim ij represents the value of i th row and j th column element in the similarity matrix, with ∈ × Sim RK RK being the similarity matrix. E i and E j are the rank-1 matrix reconstructed from temporal course and corresponding spatial distribution. ⨀ represents Hadamard product. Because of the standardization and orthogonalization of the ICA, the formula can be simplified as follows: The formula shows that the similarity matrix used for Tensor Clustering can be expressed as the Hadamard product of the similarity matrix for the temporal courses and the spatial distribution (details can be found in Appendix. C). Using Eq. 10 to estimate the similarity matrix reduces the size of computer memory necessary for clustering and improves the speed of the algorithm.
Tensor clustering will result in R classes, ideally equal to the number of components estimated. For each class, there should be K components (e.g., each class should cluster together all 50 runs of the same component for a given class, with one class per independent component). The degree of intra-class similarity should be very high if each class consists of all 50 runs for a single component, whereas the inter-class similarity inter-class should be very low if each class represents a single stable distinct independent component. The cluster quality index, or I q , calculated as the difference between average intraclass similarities and average inter-class similarities, can be used as an index to evaluate the stability of the results (Himburg et al., 2004) r S ( ) int represents the mean intra-class similarity, r S ( ) ext represents the mean inter-class similarity. = ⋯ r 1,2, ,R, represents r th component of ICA decomposition. While I q can be calculated from tensor clustering or clustering based on the component or coefficient matrices, e.g., independent component spatial maps or temporal courses respectively, there are no constraints on the independence of the temporal courses for spatial ICA. Thus, we use intra-class similarity to compare tensor, spatial, and temporal clustering with each other as a function of model order.
There is a slight difference in the stability analysis for subject-level versus group ICA. For subject-level ICA, the reconstructed rank-1 matrix is calculated with coefficient and component matrices. For GICA, it is necessary to reconstruct the rank-1 matrix using Eq. 4, which allows the stability of each subject's rank-1 matrix to be calculated. After the stability index of each component is calculated, the average of all components can be used as a stability index for the ICA algorithm at a given dimensionality.

Stability of components across model orders
Hierarchical cluster analysis was applied to all components, for each model order, to determine how independent component stability varied with ICA dimensionality. The clustering analysis was repeated with different dimensionalities to obtain the stable components with the specific rank-1 matrix component Ē (i.e., spatiotemporal tensor) used as the template. The similarity with the template for temporal and spatial information alone was also calculated. Finally, components were compared to 10 replicable brain networks (Damoiseaux et al., 2006), to determine the validity of results.

Simulated data
Simulated data constructed to compare tensor clustering with standard methods. Data were generated through the linear mixing of source signals and noise (SNR: 20 dB)., with source signals as shown in Fig. 3 (http://mlsp.umbc.edu/simulated_fmri_data.html). The simulated data were decomposed with InfomaxICA and FastICA 50 times, with multiple model orders (2 < dimensionality < 10).  Illustration of spatiotemporal tensors. The tensor of stacked rank-1 matrices that is fed into the hierarchical cluster analysis to evaluate the stability of temporal and spatial information simultaneously. The matrix is shown for subject-level ICA; for GICA, the temporal dimension is number of timepoints x number of subjects long (e.g., data are temporally concatenated).   (Yan et al., 2016) plug-in DPARSF. The first 10 time points were removed, and slice timing correction, realignment for motion correction, spatial normalization to MNI standard space were completed, and smoothed with 4.5 mm isotropic Gaussian Kernel. Temporal band pass filtering 10-150 mHz was applied to the data. After preprocessing, data were temporally concatenated across subjects, and two different ICA algorithms (InfomaxICA and FastICA) were used to identify spatiotemporal components. Amplitude of low-frequency fluctuation (ALFF) (Zang et al., 2007) was extracted as a measure of the spontaneous fluctuations in BOLD signal intensity across time that reflects the strength of within-network connectivity. ALFF values were correlated with a clinically relevant measure of cognitive functioning, the Montreal Cognitive Assessment (MoCA), to support the ecological utility of the proposed method. With a total possible score of 30, the MoCA assesses a range of cognitive abilities including memory, executive functioning, and attention. The MoCA is a brief screening tool for mild cognitive impairment, where scores lower than 26 indicate probable cognitive impairment. In addition, between-network functional connectivity was assessed via computing the Pearson correlations between the timecourses for each component pair for each subject and then assessing group differences using an unpaired t-test. Bonferonni correction was applied to correct for multiple tests (across network pairs) to achieve p < 0.05, corrected.

Simulated data
The simulated data contains 6 signal sources. The performance of FastICA with the simulated data is shown in Fig. 4. When the model order exceeds 6, the stability and convergence of the algorithm starts to decline, which suggests FastICA has identified 6 stable signal sources. For the InformaxICA algorithm, similar results are shown in Fig. 6, with similar degradation when the model order is larger than the number of sources. The average of intra-cluster similarities for the cluster results based on the component matrix, coefficient matrix and spatiotemporal tensors are shown in Fig. 5 for FastICA and Fig. 7 for InfomaxICA. Spatiotemporal tensor clustering was more sensitive to model order when decomposed with FastICA, and less sensitive to model order with the InfomaxICA algorithm.

In vivo clinical data -InfomaxICA performance
The InfomaxICA algorithm was run 50 times with multiple dimensionalities (2 < dimensionalities < 50). Then, the convergence and stability of the algorithm were evaluated, as shown in Fig. 8. When the model order is greater than 20-30, the convergence and stability of the algorithm begins to decline. For the in vivo data, considering the mean and SD of I q and the convergence, we selected results for model order 20 from the InfomaxICA algorithm for further investigation. Fig. 10 shows the spatiotemporal tensor I q values of the different components for model order 20. The figure shows that both the temporal courses and spatial maps of all subjects were stable under this model order. For the group ICA, the Kullback-Leibler information criterion (KIC) algorithm (Cavanaugh, 1999) suggests the optimal model order is 40, therefore the stability results using the component matrices, coefficient matrices, and spatiotemporal tensors for model order 40 and 20 were compared as shown in Fig. 11 (for six subjects). Neither the stability analysis based on the component matrix nor the stability analysis based on the coefficient matrix can evaluate the stability objectively by visual comparison. Tensor clustering represents a more concise and reasonable index of the stability of the ICA algorithm as it considers information from both the coefficient matrix and the component matrix. Via visual inspection, clustering based on the coefficient matrix, the component matrix and the spatiotemporal tensor for the two different model orders shows that the algorithm is more stable when the model order is 20.
As shown in Fig. 9, assessment of the the stability of InfomaxICA  applied to in vivo data shows that clustering based on spatiotemporal tensors are more sensitive to model order than clustering based on the component matrix (spatial information) or coefficient matrix (temporal information) alone. When the model order is between 2 and 7, both spatial maps and temporal courses are stable. The average of the intracluster similarities for coefficient, component, and spatiotemporal tensor results are almost equal to 1. However, when the model exceeds 40, there is increased instability in the coefficient matrix when compared to component matrix. Spatiotemporal tensor results are more sensitive to this variation, as information from both component and coefficient matrices are used to determine optimal model order.
For model order 20, the amplitude of low-frequency fluctuation (ALFF) for each component was extracted as a measure of local brain    function and correlated with MoCA scores. The ALFF of the DMN and salience networks showed a significant association with MoCA scores (Fig. 12). The functional connectivity of different networks was also evaluated. The connectivity between a frontoparietal network and a visual network was significantly lower (p = 0.04) in patients relative to HC; however these results did not remain significant upon correcting for all network pairs tested and are thus not discussed further. The results were consistent across 50 runs of the ICA decomposition.

In vivo clinical data -FastICA performance
The FastICA algorithm was run 50 times for each model order (2 < dimensionalities < 50). The convergence and stability of the algorithm are shown in Fig. 13, and optimal dimensionality was selected as 14. The average intra-cluster similarity index from cluster results based on the component matrix, coefficient matrix, and spatiotemporal tensors, are shown in Fig. 14. Similar to findings from InfomaxICA results, spatiotemporal tensor-based clustering was more sensitive to stability across model orders, which is consistent with the InfomaxICA results shown in Fig. 9. As shown in Fig. 15, nearly all rank-1 matrices were also stable under this model order. The stability results based on the component matrix, coefficient matrix, and spatiotemporal tensors with dimensionality 40 and 14 are shown in Fig. 16.
With a dimensionality of 14, ALFF of a posterior parietal component extracted by FastICA was significantly associated with MoCA scores, as shown in Fig. 17. Functional connectivity between the same two networks, frontoparietal and visual, was significantly lower (p = 0.02) in patients compared to HC, which is consistent with the results of In-fomaxICA, with results being consistent across 50 runs of the ICA decomposition. However, these results do not survive the correction for multiple network pairs and are thus not discussed further.

Discussion
While stability of spatial maps from ICA of fMRI data is used to inform model order selection post hoc, from a theoretical perspective, stability of the temporal courses from ICA may also provide important information regarding model order selection. The present study sought to investigate any insights into model order selection that may be gained by also considering temporal stability. Our study has revealed several insights: (1) hierarchical cluster analysis is a useful method for assessing ICA component stability generally (e.g., using any of the component, coefficient, or tensor matrices), (2) assessment of stability of component temporal courses provides important information as to the stability of ICA results, (3) the convergence of FastICA and InfomaxICA algorithms was related to model order, and (4) spatiotemporal tensor clustering was sensitive to model order effects. Fig. 10. In vivo results. The stability analysis results for different subjects for model order 20 using the InfomaxICA algorithm. All components are stably decomposed, with the Iq values of all subjects higher than 0.9 for every component. Fig. 11. In vivo results. The cluster results of component matrix, coefficient matrix and spatiotemporal tensors with model orders 40 and 20, using InfomaxICA. In each clustering result, clusters are indicated by black convex hulls and grey/black lines connect similar clusters. More compact clusters (with higher intra-class similarity) are more stable. G. Hu, et al. Journal of Neuroscience Methods 325 (2019) 108359 Furthermore, (5) components from model orders selected by considering both convergence and component stability are consistent with previous literature and associated with clinically relevant markers of pathology. The results of this study are impactful for a number of reasons.
Hierarchical cluster analysis provided important information about ICA decomposition performance in the current study. By evaluating the stability of and convergence of spatiotemporal results under different dimensionalities, we were able to accurately identify an optimal model order. Additionally, hierarchical clustering allowed us to compare the performance between different ICA algorithms, which can provide important information to investigators when making methodological decisions regarding ICA decomposition. For example, the FastICA converged faster than InfomaxICA, which was consistent across 50 runs in multiple dimensionalities. However, the InfomaxICA produced more stable decompositions compared to the FastICA algorithm, under the same model order with a trade-off of convergence at higher model orders. Assessment of stability when making methodological decisions would improve transparency of algorithm selection and replicability of findings.
The results of resting-state fMRI analysis support the conclusion that   the stability of the spatial and temporal data differ across model orders, with the temporal courses being much less stable than spatial maps. Clustering based on spatiotemporal tensors better captures the interplay between spatial and temporal stability and may prove to be superior for evaluating the stability and performance of the algorithm and determine optimal dimensionality in further investigations. The current study explored the stability and convergence under different model orders and ICA algorithms, a process that required significant time and computing resources. However, given the variability in convergence and component stability across model orders, testing multiple dimensionalities may be an important step in fMRI data analysis to improve methodological rigor. For each ICA decomposition, stability and convergence varied across model order. When the model order exceeded a certain range, temporal component instability increased. It is possible that this instability results from the splitting of larger networks into subcomponents, which may still be of interest to some researchers despite their instability. Studies which utilize large dimensionalities may want to carefully consider the effect of time course instability on results and utilize the proposed methodology to objectively assess the performance of their algorithms. While the current study used convergence and stability to assess ICA decomposition performance, we recognize there are other aspects of performance that have not been considered. Further research is needed to fully address the topic, and expand upon the current findings.
The findings in the in vivo clinical data analysis suggest that when spatiotemporal stability is considered in model order selection, clinically meaningful network components can be extracted. While incorporating both spatial and temporal stability into algorithm evaluation is sensible from a theoretical perspective, it was necessary to establish that selected components were still ecologically valid. While the decisions about dimensionality did not differ between spatial, temporal, and spatiotemporal decompositions, the inclusion of spatiotemporal stability may be more important for other clinical populations where time course variability is a marker of pathology (Malinen et al., 2010). Further research is needed to expand upon our findings and evaluate consistency across clinical populations.
In conclusion, for blind source separation (matrix decomposition and tensor decomposition) of fMRI data, the stability of the algorithm is a problem that deserves careful consideration. Both spatial and temporal aspects of resulting independent components provides meaningful information, so assessment of spatiotemporal stability, via a multidimensional rank-1 matrix, will provide more complete information regarding the balance between spatial and temporal stability than assessment of either alone. Hierarchical cluster analysis of spatiotemporal tensors is a useful technique for assessing the stability across model orders and assessing the performance of specific ICA decompositions. This is useful for investigators both when selecting model order posthoc and when making methodological decisions about which ICA algorithm is best suited for their data and research questions. Our Fig. 15. The stability analysis results of different subjects when the model order is 14 in the FastICA algorithm. For component#1-10, #12, #13, the Iq values of all subjects were higher than 0.9, so all the components are considered to be stably decomposed. For Component#11, because the stability of one subject is less than 0.9, the component will not be further analyzed. software is available at https://github.com/GHu-DUT/Tensorclustering.

Acknowledgements
This work was supported by National Natural Science Foundation of China (Grant No. 81471742 & Grant No. 91748105)

Appendix A. Temporal concatenated GroupICA
For multi-subject fMRI data analysis, we used GICA (Calhoun et al., 2001). First, data reduction was performed for each subject using PCA: Z m represents the m th subject's fMRI data after preprocessing. V m is the data reduction matrix, which is obtained from principal component analysis (PCA). D m is the reduced data for each subject. For GICA, the reduced data for each subject was concatenated temporally to form an aggregate reduced data matrix. The aggregate data matrix is then reduced again via group PCA: Reconstructing temporal courses: 1,2, ) contains temporal course information of each m th subject. Y represents the estimation of the source signals S, e.g., the spatial distribution of each component. In the case of temporal concatenated GICA, spatial distributions are assumed to be the same across subjects whereas temporal courses are different.
Appendix B. Rank-1 matrix ultimately decomposed by ICA Consider a linear signal mixing model: N R is the mixing matrix and ∈ × S R M are the source signals. R represents the number of source signals.
Next, consider the ICA decomposition model: R M is the estimate of the source matrix S. However, it is well known that ICA has magnitude and polarity indeterminacy (Hyvarinen, 1999 is true under the condition of global optimal solution (Cong et al., 2011). E.g, the rank-1 matrix E k from ICA is estimated without ambiguity. The proof is as follows: With the global matrix defined as: E i and E j are rank-1 matrices constructed from the temporal courses and the corresponding spatial distributions. With variance ambiguity of ICA algorithms (Hyvärinen and Oja, 2000), both temporal courses and spatial maps can be standardized. After standardization, the mean of temporal courses (a r ) and spatial maps (s r ) is zero and the variance of each is one. In order to accelerate the clustering speed and reduce the required computer memory, simplification of similarity matrix was used and shown as follow: SimA represents similarity matrix of temporal courses and SimS represents similarity matrix of spatial maps. By this way, it is proved that the similarity matrix of Tensor Clustering can be expressed as the Hadamard product of the similarity matrix of the temporal courses and that of spatial distribution.