Correcting physiological noise in whole-head functional near-infrared spectroscopy

BACKGROUND
Functional near-infrared spectroscopy (fNIRS) has been increasingly employed to monitor cerebral hemodynamics in normal and diseased conditions. However, fNIRS suffers from its susceptibility to superficial activity and systemic physiological noise. The objective of the study was to establish a noise reduction method for fNIRS in a whole-head montage.


NEW METHOD
We have developed an automated denoising method for whole-head fNIRS. A high-density montage consisting of 109 long-separation channels and 8 short-separation channels was used for recording. Auxiliary sensors were also used to measure motion, respiration and pulse simultaneously. The method incorporates principal component analysis and general linear model to identify and remove a globally uniform superficial component. Our denoising method was evaluated in experimental data acquired from a group of healthy human subjects during a visually cued motor task and further compared with a minimal preprocessing method and three established denoising methods in the literature. Quantitative metrics of contrast-to-noise ratio, within-subject standard deviation and adjusted coefficient of determination were evaluated.


RESULTS
After denoising, whole-head topography of fNIRS revealed focal activations concurrently in the primary motor and visual areas.


COMPARISON WITH EXISTING METHODS
Analysis showed that our method improves upon the four established preprocessing methods in the literature.


CONCLUSIONS
An automatic, effective and robust preprocessing pipeline was established for removing physiological noise in whole-head fNIRS recordings. Our method can enable fNIRS as a reliable tool in monitoring large-scale, network-level brain activities for clinical uses.

However, although fNIRS is attractive in clinical research (Chen et al., 2020a), its widespread use has been held back due to contamination of physiological noise in the recordings (Kirilina et al., 2013;Pinti et al., 2018;Tong, 2012). Firstly, due to the fact that the near-infrared light passes through the skin and skull before reaching the cerebral cortex, the fNIRS measurements are sensitive to absorption in the superficial layers of the head (Saager and Berger, 2005). Thus, the signal obtained from a regular long-separation (LS) channel (for example, 3 cm apart between a pair of source and detector in adults) is a mixture of absorptions by deep cerebral and superficial tissues (Saager and Berger, 2005;Takahashi et al., 2011). Furthermore, even in the cerebrally originated signals, there could be contaminations arising from physiological systemic activity, including changes in heart rate, blood pressure, breathing rate and volume, autonomic nervous system activity and movements (Kirilina et al., 2013;Sasai et al., 2011;Schecklmann et al., 2017;Tachtsidis and Scholkmann, 2016;Tong, 2010;Yuan et al., 2013). The superficial activity and physiological noise collectively affects the accuracy of fNIRS, leading to false positives and false negatives in the imaging of functional brain activities (Tachtsidis and Scholkmann, 2016).
Currently, denoising methods for fNIRS that are commonly used in the literature include temporal filtering as a minimal preprocessing method, general linear model (GLM) with short-separation (SS) channels (namely SS-GLM) (Gagnon et al., 2011), spatial filtering based on principal component analysis (namely PCA) (Virtanen et al., 2009;Zhang et al., 2005Zhang et al., , 2016, and PCA filtering integrated with SS channels (namely Sato method) (Sato et al., 2016). Given a sufficient sampling rate in the acquisition, a minimal preprocessing method that primarily relies on band-pass filtering is effective in removing cardiac pulsation (0.8-1.6 Hz) and respiration (0.2-0.6 Hz), but not capable to address the noise in low-frequency oscillations (0.04-0.2 Hz, LFO) and very-low-frequency oscillations (0.01-0.04 Hz, VLFO) (Stefanovska et al., 1999;Tong et al., 2011), which however overlap with the frequencies of resting and evoked neural activities. More advanced methods such as the SS-GLM method utilizes additional information from SS channel which has a source-detector distance smaller than 1 cm and is primarily sensitive to superficial tissues (Gagnon et al., 2012(Gagnon et al., , 2011Saager and Berger, 2005). The SS signals can be regressed from LS recordings to attenuate the noise originated from superficial tissues and physiological activity (Gagnon et al., 2011). In addition, there are other denoising methods based on SS channels, including temporal embedding canonical-correlation analysis (tCCA) (von Lühmann et al., 2019(von Lühmann et al., , 2020, Kalman filter (Dong and Jeong, 2018), etc. However, direct removal of SS recordings may be problematic. Theoretic simulations have demonstrated that SS recordings might contain contributions from deep brain tissues as well (Saager and Berger, 2005), although in much smaller magnitudes. More importantly, empirical evidence showed that tasks can induce block-wise modulations in SS channels (Kirilina et al., 2012;Takahashi et al., 2011). The task-related changes of oxygenated and deoxygenated hemoglobin (HbO and HbR) obtained in SS recordings have been found to be remarkably similar with those activations observed in regular LS channels, which implicates that directly regressing SS recordings from LS recordings may reduce the effect size of detecting task-related activations (Tachtsidis and Scholkmann, 2016). Meanwhile, another type of PCA-based denoising have limitations too. Out of multiple orthogonal components yielded by applying PCA on LS data, one or a subset of components should be selected and removed as noise to enhance brain activation patterns (Zhang et al., 2005(Zhang et al., , 2016. However, the selection of noise components in PCA-based methods typically relies on a manual decision and lacks a systematic approach that has fully validated the origins of physiological noise in the removed components. Due to the intrinsic correlation between noise and neural-associated signals (e.g., the SS and LS signals exhibit similar task-modulated changes), PCA-based methods have limitations in separating noise from signals of interests. Therefore, removing too many principal components is possible and can result in a substantial reduction of activation (Boas et al., 2004;Zhang et al., 2005Zhang et al., , 2016. Other methods based on the decomposition of the multi-channel fNIRS data, such as the independent component analysis (ICA) (Kohno et al., 2007), also suffer from similar limitations when manually choosing noisy components to remove. Thus, in both PCA-and ICA-based methods, the preprocessing procedure is not fully automatic and their performance is not satisfactory (Watabe and Hatazawa, 2019). In the meantime, applying PCA to short-separation recordings is suggested as an effective strategy to overcome the difficulties in LS-based PCA and SS-GLM methods, while exploiting the regional homogeneity in SS recordings (Sato et al., 2016).
Furthermore, physiological fluctuations in cardiac pulsation, respiration, and voluntary head and jaw movements can affect the fNIRS measurements (Kirilina et al., 2013;Sasai et al., 2011;Tachtsidis and Scholkmann, 2016;Tong, 2010;Yuan et al., 2013). While filtering is commonly used to filter out the frequency range regarding pulse and respiration, non-stationary fluctuations that are related to slow changes in pulse and respiration rate and volumes have been shown to impact hemodynamic measurements in fMRI (Birn et al., 2006;Chang et al., 2009b;Hu et al., 1995;Power et al., 2012;Shmueli et al., 2007;Van Dijk et al., 2012;Wise et al., 2004), and yet to be fully considered in fNIRS denoising procedure.
In addition to the above-mentioned limitations in the fNIRS preprocessing methods, an unmet need so far is a denoising method for whole-head fNIRS recordings. Most of the existing methods were developed in a partial montage that only includes a relatively small number of channels (Gagnon et al., 2011;Zhang et al., 2007Zhang et al., , 2005, or covers only part of the cortex (Sato et al., 2016;Zhang et al., 2016). Despite of some recent efforts (Noah et al., 2021;Santosa et al., 2020), these methods have not yet been fully evaluated in a whole-head montage at a complex task when current activations from multiple, distant regions arise, which however is especially critical for imaging resting-state brain networks. Till now, it remains unknown whether the denoising methods for partial montages would still be effective in a whole-head montage, considering that noise in an expanded recording range will challenge the underlying assumptions employed in these methods. Blood flow in the skin tissues covering the scalp is known to be heterogenous (Kirilina et al., 2013;Schuenke et al., 2010;Tong et al., 2010). Facial muscles might be another contributor to physiological noise when the fNIRS montage extends to the temporal region (Novi et al., 2020;Schecklmann et al., 2017). Therefore, a denoising strategy that is proven effective to the whole-head montage is needed to enable fNIRS for studying large-scale brain networks, especially in diseased conditions (Irani et al., 2007;Leff et al., 2011;Obrig, 2014).
In the present study, we proposed an automatic denoising method for whole-head fNIRS recordings (as shown in Fig. 1). Our method namely PCA-GLM integrates principal component analysis and general linear model regression based on long-and short-separation channels and auxiliary measurements. Experimental data acquired from thirteen health individuals during a visually cued motor task were used to evaluate whether activations in motor and visual areas are both detectable. Then our PCA-GLM method was compared with four other commonly used methods, i.e. minimal processing of filtering only (namely No Correction), SS-GLM (Gagnon et al., 2011), PCA (Huppert et al., 2009;Zhang et al., 2005), and the Sato method (Sato et al., 2016). Performance of all methods was assessed using quantitative metrics, including contrast-to-noise ratio, within-subject standard deviation and adjusted coefficient of determination.

Participants
This study was approved by the University of Oklahoma Health F. Zhang et al. Sciences Center Institutional Review Board. Written consent was obtained from all participants prior to the study. All the procedures were carried out in accordance with the IRB guidelines. Thirteen healthy participants without any neurological or neuropsychiatric disorders were enrolled in this study (five females, 25-50 years old). Participants received financial compensation for their participation. Data of three subjects were excluded from data analysis due to bad data quality.

Data acquisition
Simultaneous fNIRS and peripheral measurements were collected in all participants. In addition, EEG recordings were recorded concurrently but were not used in the present study. fNIRS data were acquired using a NIRScout system (NIRx GmbH, Berlin, Germany). As illustrated in Fig. 1, The acquisition consisted of 34 dual-wavelength (760 nm and 850 nm) sources, 31 LS detectors and 8 SS detectors, which jointly formed 109 LS channels and 8 SS channels evenly distributed over the entire head, following a similar montage of our previous study (Chen et al., 2020b). The SS channels had source-detector separations of 8 mm. As suggested in fMRI, the sampling frequency should be set sufficiently high to avoid aliasing effects of respiratory and cardiac activity (Liu, 2016). While fNIRS has the advantage of sampling faster compared with fMRI, its sampling frequency decreases as the number of light sources used increases. In order to maintain a high sampling rate while using a large number of sources, we employed parallel source illumination in the NIRScout system, which illuminated multiple sources in each single time step. To avoid optical cross-talk, the optimal source illumination sequence was manually generated based on the criterion that the distances between sources illuminated during the same time step should be larger than 9 cm so that each detector does not receive light from multiple sources simultaneously. 34 sources were illuminated in 10 time steps, which resulted in a sufficient sampling rate of 6.25 Hz. The sensitivity profile of the fNIRS probe to cortical absorption changes obtained by Monte Carlo simulation with photon propagation in Atlas-Viewer (Aasted et al., 2015) is shown in Fig. 1. The number of photons in the simulation was set as 1e7 (Aasted et al., 2015). The fNIRS optodes and EEG electrodes were placed on an elastic cap (EASYCAP GmbH, Herrsching, Germany), according to the international 10-5 system (Oostenveld and Praamstra, 2001). Additionally, auxiliary data of acceleration, respiration, cardiac pulses were recorded by a three-dimensional accelerometer, a pneumatic respiration belt and a photoplethysmography respectively which were connected to the auxiliary input ports of the actiCHamp system (BrainProducts, München, Germany). Auxiliary data were sampled at a frequency of 500 Hz.

Experimental paradigm
Participants were instructed to clench the right hand guided by visual cues at a comfortable sitting position in a dark room. Each participant performed two repeated sessions of clenching at 1 Hz and two sessions of clenching at 2 Hz. Within a session, there were an initial resting period (30 s) and six blocks. Each block contained a task period (20 s) and a resting period (30 s). Fig. 2 shows the schematic diagram of the experimental paradigm. E-prime (Psychology Software Tools, Sharpsburg, PA) software was used to display instructions and visual cues on the monitor during the experiments.

fNIRS preprocessing
fNIRS data were automatically preprocessed using proprietary scripts in MATLAB® (The Mathworks, Natick, MA) in conjunction with the HOMER2 toolbox (Huppert et al., 2009). The steps of the pipeline are illustrated in Fig. 3. Specifically, the preprocessing pipeline included: 1) converting raw data to optical density (OD); 2) computing power spectral densities using the Welch's method (a window length of 60 s and 50% overlap) and excluding bad channels that showed no heartbeat frequency peak (0.8-1.6 Hz); 3) bandpass filtering OD with 0.008-0.2 Hz; 4) converting filtered OD to relative concentration changes of oxy-hemoglobin (ΔHbO) and deoxy-hemoglobin (ΔHbR) based on the modified Beer-Lambert law (MBLL) (Scholkmann et al., 2014). The differential pathlength factors were selected as 7.25 and 6.38 for 760 nm and 850 nm, respectively (Herold et al., 2018). The molar extinction coefficients at 760 and 850 nm were 645.5 cm − 1 /M and 1097.0 cm − 1 /M for HbO and 1669.0 cm − 1 /M and 781.0 cm − 1 /M for HbR, respectively (Piper et al., 2014).

Identifying physiological noise based on principal component analysis of long-and short-separation data
After preprocessing of fNIRS data, we developed a novel approach (namely PCA-GLM) to remove the physiological noise by utilizing PCA and GLM based on 109 LS and 8 SS channels that evenly covered the whole head. Firstly, PCA was applied to LS data to decompose signals into multiple components. Topographic distributions were displayed over whole-head LS channels ( Fig. 4A) and assessed by a metric called the coefficient of spatial uniformity (CSU) (Kohno et al., 2007). Because spatial uniformity indicates superficial skin responses (Zhang et al., 2005), a principal component with the highest value of CSU was identified and abbreviated as PC-LS hereafter. After determining the PC-LS component, the time course of PC-LS was not directly removed from the multi-channel LS data; instead, the PC-LS was used to select a component from the SS measurements which are more directly sensitive to the absorption by superficial tissues (Saager and Berger, 2005). Specifically, a total of eight-channel preprocessed SS data were subject to a separate principal component analysis, and the SS component that had the highest temporal correlation with PC-LS was chosen as the skin-response component (abbreviated as PC-SS hereafter). The time course of PC-SS was then used as a nuisance regressor to be removed by GLM. Exemplar time courses of PC-LS and PC-SS are plotted in Fig. 4B and C.

Removing physiological noise using linear regression
A general linear model was configurated per each session to remove physiological noise. Prior to GLM, auxiliary data of acceleration, respiration, cardiac pulsation were processed by a band-pass filter with 0.008-0.2 Hz and further detrended by a 3-order polynomial drift. The time course of PC-SS, the detrended auxiliary data (acceleration, respiration and cardiac pulsation) and a 3-order polynomial drift were included in the design matrix as nuisance regressors. The main effect was accounted for by a hemodynamic response function (HRF) regressor, which was derived by convolving the block design with a canonical two-gamma HRF model (Lindquist et al., 2009). Fig. 5 shows an example of detrended nuisance regressors in a single session together with an HRF predictor.

Comparison with other preprocessing methods
Our proposed PCA-GLM method was compared with four other commonly used preprocessing methods, i.e. a minimal preprocessing of temporal filtering (namely No Correction), the PCA on SS channels (namely Sato method) (Sato et al., 2016), the spatial filtering method (namely PCA) (Huppert et al., 2009;Zhang et al., 2005) and the GLM-based method that directly regresses the SS from LS recordings (namely SS-GLM) (Gagnon et al., 2011). To implement the PCA method (Zhang et al., 2005), we used the enPCAFilter (nSV = 3) function in Homer2 toolbox (Huppert et al., 2009), which used the entire session of both task and resting data to calculate the PCs. In the SS-GLM method, we used the nearest neighbor SS channel to each LS channel as a regressor representing physiological noise, together with a 3-order polynomial drift and an HRF regressor. In the Sato method, the first PC of SS recordings were used as a nuisance regressor along with a 3-order polynomial drift and an HRF regressor in a general linear model to denoise LS recordings.

Evaluation of denoising performance
To quantify the performance of the proposed PCA-GLM method and four other methods, we evaluated three metrics including contrast-tonoise ratio (CNR), within-subject standard deviation (SD) and adjusted coefficient of determination (adjusted R 2 ).  (1) The contrast-to-noise ratio (CNR) is calculated as the difference between the mean amplitudes during the task and the rest period normalized by the summation of variances at task and rest periods (Cui et al., 2010;Nguyen et al., 2018;Zhang et al., 2005). The CNR is defined as follows: where "task" denotes the task period (from 6 s to 16 s), and "rest" denotes the resting period (from − 5 s to 0 s) with respect to the onset of each task block (t = 0 s). A high CNR value indicated that the ratio of task-evoked signal to noise is high.
(2) The within-subject standard deviation (SD) (Brigadoi et al., 2014) is calculated as the standard deviation across blocks of a single session when calculating the block averages (defined in Section 2.9). (3) The adjusted coefficient of determination (adjusted R 2 ) is calculated to evaluate the goodness of fitting by comparing the preprocessed time courses and the task design (Sato et al., 2016). A standard GLM with polynomial drift and HRF regressor was applied to the time courses processed by the No Correction method and those by the PCA method, respectively.
The metrics of CNR, within-subject SD and adjusted R 2 were reported in two representative channels over the motor and visual regions respectively, which were determined by finding the channel that had the highest beta value associated with the task effect in GLM. To assess the performance at a group level, metrics obtained from all subjects were pooled and compared using a two-tailed paired t-test with the Benjamini-Hochberg method to correct multiple comparisons (Benjamini and Hochberg, 1995).

Visualization of task-related activations in whole-head montage
To visualize fNIRS in detecting task-related activations, we calculated time courses of relative changes that are normalized to the variances of task and rest periods, consistent with the definition of contrastto-noise ratio. Specifically, where y and y normalized denoted the time courses of a fNIRS channel before and after normalization, "task" denotes the task period (from 6 s to 16 s) and "rest" denotes the resting period (from − 5 s to 0 s) with respect to the onset of task blocks (t = 0 s). In each session, the normalized relative changes were segmented into task blocks, aligned at the beginning of task blocks, and then averaged as the block average per  session. At the group level, the block averages of all sessions were averaged. Channels representing the contralateral motor cortex, the ipsilateral cortex and the bilateral visual cortex were illustrated. At the group-level, in order to determine whether the activation was significant, a one-sample, a two-sided t-test was performed at each time points across all subjects' sessions, separately for our GLM-PCA method and No Correction method. In addition, a paired two-sample, two-tailed t-test was performed to evaluate whether GLM-PCA and No Correction methods yielded different time courses. The resulted p values were corrected for multiple comparisons using the Benjamini-Hochberg procedure (Benjamini and Hochberg, 1995).
In addition, to visualize the whole-head topography in detecting task-related activations, a group-level one-sample, two-tailed t-test vs. 0 was performed on all channels across all subjects' sessions, based on the beta values associated with the task effect in GLM. The resulted p values were corrected for multiple comparisons using the Benjamini-Hochberg procedure (Benjamini and Hochberg, 1995).

Identification of physiological noise from LS and SS fNIRS channels
We have designed a novel two-step PCA approach to identify physiological noise from 109 LS and 8 SS fNIRS recordings that evenly covered the whole head. While our method was applicable to both HbO and HbR, we summarized our findings of HbO in the main text and included results of HbR in the Supplemental Material. Firstly, a PCA decomposition was performed on the LS recordings and the principle component associated with the highest CSU value (Kohno et al., 2007) was selected as PC-LS. Fig. 4A shows the topographies of the identified PC-LSs in two individual subjects and grand average, respectively. The CSU values corresponding to the exemplar sessions in Subject 1 and Subject 2 are 2.06 and 2.66, respectively. Also, in all individuals, the PC-LS presents a high spatial uniformity with a CSU of 1.74 ± 0.77 (mean ± SD), and ranging from 0.64 to 3.62. The values are consistent with our previous finding (Zhang et al., 2019) and also comparable to those reported in Kohno et al. (2007), despite that in the current study a whole-head montage of more than three times channel numbers was used. Furthermore, we found that the PC-LS accounted for 35.94 ± 18.01% of the total variances in long-distance channel data across all sessions. Consistently described in previous fNIRS studies (Zhang et al., 2005), such a component that has a homogenous spatial distribution in regular LS channels is recognized for representing noise originated from superficial skin responses.
Next, a separate PCA decomposition was performed on 8 SS channels that evenly distributed over the scalp. The principal component derived from SS recordings which has the highest temporal correlation with the time course of previously identified PC-LS was selected as the noise component of all SS data (namely PC-SS). This procedure was operated under the considerations that measurements of multiple LS and SS channels are sensitive to a common physiological noisy component originated from the superficial skin tissues Berger, 2005, 2008). Indeed, Fig. 4B and C show representative time courses of PC-LS and PC-SS from two representative subjects. Noteworthy, PC-LS and PC-SS exhibited a similar profile. Across all subjects and sessions, the correlation between PC-LS and PC-SS was 0.67 ± 0.14 (mean ± SD).
Aside from the temporal similarity between PC-LS and PC-SS, they exhibited important differences. Fig. 4B and C show that PC-SS have more fluctuations in the same time window than the PC-LS, suggesting that PC-SS contains higher frequency oscillations. Furthermore, although PC-LS exhibits a uniform distribution, the PC-LS shows an increase in HbO (and a decrease in HbR, data not shown) during the motor task period, which are similar to the task-related changes in a regular LS channel over the motor area (e.g., the black color curves in Fig. 4B and C from the same session). The observation is consistent with previous studies which reported that although PC-LS captures noise of superficial contributions, it is not free of task-modulated changes (Sato et al., 2016). Therefore, in order to avoid attenuating the task-related changes in LS channels, our processing pipeline removed the PC-SS rather than the PC-LS. Other nuisance regressors included the auxiliary physiological measurements and 3-order polynomial drift (shown in Fig. 5).

Evaluation of the proposed denoising method
We have applied our proposed method (PCA-GLM) and four established preprocessing methods (i.e., No Correction, PCA, SS-GLM and Sato Method) to the experimental dataset acquired in a group of healthy subjects. Regarding performance metrics of CNR, within-subject SD and adjusted R 2 , two-tailed paired-sample t-tests were conducted to compare our PCA-GLM method with four other methods. Benjamini-Hochberg method was used to address the multiple comparison problem (Benjamini and Hochberg, 1995). All results shown here were based on HbO contrast unless otherwise specified. Fig. 6 demonstrates that our PCA-GLM method improves upon the minimal preprocessing method with no correction of superficial noise, yielding significantly higher CNR, lower within-subject SD and higher R 2 (q < 0.001). Compared with four established denoising methods (No Correction, Sato method, PCA and SS-GLM), our PCA-GLM resulted in the highest CNR and R 2 in both motor and visual channels and significantly outperformed the other methods. In terms of the within-subject SD, PCA-GLM shows good reliability with the lowest value than other methods, except the PCA. However, in the PCA method, the low within-subject SD was accompanied by low CNR and low adjusted R 2 , which were attributed to the overall small magnitudes of time courses after removing excessive components.
Noteworthy, our PCA-GLM method is completely automatic, including the determination of PC-LS and PC-SS as well we as the GLM regression. In the four other methods that were compared, the PCA filtering method was configurated to be automatic by mandating a fixed number of components to be removed (n = 3). Fig. 7 shows the automatic denoising outcomes in a representative subject (Subject 1) by applying our PCA-GLM method in a total of 109 long-channel recordings that covered the whole head. As our study used a visually cued motor task, activations in both the motor and visual areas were expected. Whole-head topographies of HbO activations across 5-15 s in the single trials and block averages of the normalized relative changes are shown in Fig. 7A. In addition, the single-trial and blockaverage time courses of the normalized relative changes from selected channels in the motor and visual areas are illustrated in Fig. 7B. After processing by PCA-GLM, a strong and focal activation in the contralateral motor area became evident in both the single-trial and the blockaverage topography. Other identified areas included the ipsilateral sensorimotor area with a focus shifted towards the posterior direction than the contralateral motor area, as well as the bilateral visual area that appeared to be symmetrically activated. In the time courses of single trials and block averages, the PCA-GLM yielded an overall stronger activation in ΔHbO during the task durations, compared with the minimal processing.

Whole-head topography of denoised fNIRS data
Group results across all subjects are shown in Fig. 8. The grand average of ΔHbO time courses from the motor and visual areas is shown in Fig. 8A. In multiple channels over the contralateral motor cortex, ipsilateral sensorimotor cortex and bilateral visual cortices, significant activations during the tasks were detected by our PCA-GLM method (Fig. 8A). Compared with the results from minimal processing with no correction of SS channels, our method yielded a higher magnitude of HbO increases during the task periods and more time points with significant activations from rest periods (q < 0.05, multiple comparisons corrected), which is consistent with a group-level increase of CNR (Fig. 6A). A group-level topography of activations is shown in Fig. 8B. A t-test on the task-association effect revealed significant activations in the channels from the contralateral motor, the ipsilateral premotor, and the bilateral visual areas (q < 0.05, after applying the Benjamini-Hochberg method with a false discovery rate of 0.05).

Discussions
fNIRS has seen rapid development and increasing use in studying the human brain under normal and diseased conditions (Chen et al., 2020a;Pinti et al., 2018;Wheelock et al., 2019). However, fNIRS suffers from its susceptibility to the superficial tissues and systemic physiological noise associated with the cardiac pulse, respiration and movement (Kirilina et al., 2013;Schecklmann et al., 2017;Tong et al., 2011). The presence of physiological noise confounds the accuracy of fNIRS for monitoring brain activation, with false positive and false negative results in images (Tachtsidis and Scholkmann, 2016). Therefore, in order to tap the full potential of fNIRS, it is essential to correct these confounding noise from fNIRS signals.
In the current study, we proposed a novel denoising method namely PCA-GLM for whole-head fNIRS recordings. Our method was applicable and effective in a montage of 109 regular LS and 8 SS channels that evenly covered the scalp. To our knowledge, our study is the first to report a preprocessing method that integrates long-and short-separation channels and auxiliary measurements to improve fNIRS signals in a Fig. 6. Comparison of denoising performance. CNR, adjusted R 2 and within-subject SD in HbO of the motor channel (left column) and visual channel (right column) were compared between PCA-GLM and No Correction, Sato method, PCA or SS-GLM. The asterisks above lines indicate statistically significant differences between different methods (* q < 0.05, ** q < 0.01, *** q < 0.001, after applying Benjamini-Hochberg method). The error bars indicate the standard errors across sessions.
whole-head montage. Especially, our study has evaluated the proposed preprocessing method in a complex task and for the first time reported effective denoising outcome of concurrent visual and motor activations in a whole-head montage. Results demonstrated that, during the task-on periods of a visually cued motor task, oxyhemoglobin activities in the contralateral primary motor, ipsilateral premotor/sensorimotor, and  The red and blue asterisks indicate the time points with significant activations (q < 0.05 after Benjamini-Hochberg correction) for the block averages produced by PCA-GLM and No Correction, respectively. The dark asterisk indicates the time points with significant differences between PCA-GLM and No Correction against 0 (q < 0.05 after Benjamini-Hochberg correction). (B) Topography of group-level activations associated with the task. The shown t values were from the t-test across all subjects. The dash circles indicate the LS channels that were significantly activated (q < 0.05 after Benjamini-Hochberg correction). bilateral primary visual areas were increased relative to the task-off period (shown in Figs. 7 and 8). Because subjects performed hand clenching with visual cues in the task-on blocks and rested with a grey screen in the task-off blocks, concurrent activations in the motor and visual areas were consistent with subjects' behavior. This finding of contralateral activation in primary motor and ipsilateral activation in sensorimotor areas is in agreement with the hemodynamic response previously observed in fNIRS (Franceschini et al., 2003Huppert et al., 2006;Leff et al., 2011) and fMRI studies Yuan et al., 2010Yuan et al., , 2011. In addition, activation in the primary visual cortex is also consistent with the previous report of fNIRS covering the visual regions (Chen et al., 2015;Franceschini et al., 2006;Jasdzewski et al., 2003;Liao et al., 2010;Obrig et al., 2002;Putze et al., 2014). Importantly, our study using a whole-head montage has reported concurrent activations in channels over different and distant brain lobes, i. e., motor and visual activations in a visuomotor task.
In order to remove the confounding physiological noise in fNIRS, our study utilized several auxiliary measurements in addition to the wholehead fNIRS, including multiple SS channels to capture superficial signals and auxiliary measurements of acceleration, respiration and cardiac pulsation. Based on these physiological measurements, we proposed a novel way of integrating two-step PCA and GLM to remove the systematic noise embedded in regular LS channels. We have evaluated the denoising performance of our method with four other commonly used methods for processing fNIRS signals, including the minimal preprocessing with no correction of superficial signals, PCA-based spatial filtering (Zhang et al., 2005;Huppert et al., 2009), GLM with SS channels (Gagnon et al., 2011) and the Sato method that combined PCA and SS channels (Sato et al., 2016). Although many studies so far have evaluated various fNIRS methods in denoising performance (Pinti et al., 2018;Santosa et al., 2020;Sato et al., 2016;von Lühmann et al., 2020;Wyser et al., 2020;Zhang et al., 2009Zhang et al., , 2015Zhang et al., , 2005, the data used in those studies (including synthetic data) only contains a partial montage of the head, e.g., covering only the motor cortices (Gagnon et al., 2011;Santosa et al., 2020;Sato et al., 2016;Wyser et al., 2020;Zhang et al., 2005), prefrontal cortex (Dong and Jeong, 2018;Pinti et al., 2018;Sato et al., 2016), or visual cortex (von Lühmann et al., 2019Zhang et al., 2009Zhang et al., , 2015. When evaluating these methods, experimental data or synthetic data with one focal activation are commonly used, even at a wide montage (Noah et al., 2021;Zhang et al., 2016). Importantly, our study has extended the algorithms of PCA, SS-GLM and the Sato method to a much larger set of channels evenly covering the whole head, and compared the denoising performance with our PCA-GLM method. Since PCA filtering and the Sato method reply on a homogenous global pattern in capturing superficial noise, their performance at whole-head montage was not guaranteed, considering that emerging evidence has suggested the contributors to physiological noise are heterogenous rather than homogenous (Erdogan et al., 2014;Kirilina et al., 2012;Schecklmann et al., 2017;Sutoko et al., 2019;Wyser et al., 2020). In the meanwhile, although SS-GLM allows heterogeneous SS signals in the denoising scheme, the direct removal of SS time series may reduce the effect size in regular LS fNIRS signals, since task-related activities are still detectable in superficial signalstherefore the denoising performance of SS-GLM at whole-head montage remained unknown. Evaluation of our proposed method and four other common methods showed that PCA-GLM resulted in superior CNR, within-subject SD and adjusted R 2 in experimental whole-head recordings from a visually cued motor task. The CNR and adjusted R 2 yielded by our PCA-GLM method was the highest among all methods in both motor and visual channels, and significantly outperformed all other methods. The superior performance of CNR indicates that our denoising method is valid and can improve the capability of fNIRS in detecting cerebral activations by effectively removing physiological noise. Furthermore, the experimental validation of concurrent motor and visual activities in our study suggests the feasibility of fNIRS in imaging large-scale neural networks in human subjects, such as the default mode network and attention network, which are composed of multiple nodes in distant lobes and are implicated in many neurological and neuropsychiatric diseases (Aihara et al., 2020;Duan et al., 2012;Li et al., 2018;Lu et al., 2010;Zhang et al., 2010).
Another benefit of removing physiological noise is the increased reliability in fNIRS signals, which is evidenced by the smaller withinsubject SD of our PCA-GLM method as compared with the No Correction (q < 0.001 for both motor and visual channels), SS-GLM (q = 0.07 for the motor channel) and Sato Method (q = 0.07 for the motor channel), except for the PCA method. However, in the PCA method, the low within-subject SD was accompanied by low CNR and low adjusted R 2 , which were overall attributed to the small magnitudes of time courses after removing excessive components. The time courses of tasks have further shown that our method could yield more data points with activations and yield significantly stronger activations at sensorimotor and visual cortices, compared with No Correction (Fig. 8). Furthermore, single-trial time courses processed by our method have shown prominent task-engaged activations in both the motor and visual channels (Fig. 7B), especially with higher and detectable amplitudes than the resting period. These findings further suggest that the fNIRS might be applicable in brain-computer interface studies that rely on the detection of single-trial activities (Coyle et al., 2007;Hong et al., 2015;Naseer and Hong, 2015). Because fNIRS is noninvasive, portable and robust to head movement (Carius et al., 2016;Herold et al., 2018;Perrey, 2008), it offers a complementary advantage as a sensing modality for developing neural interfaces (He et al., 2020).
Noteworthy, our denoising method is implemented as an automated pipeline for whole-head fNIRS recordings. Previously, studies employing PCA or ICA for denoising fNIRS data have relied on manual criteria, either choosing noisy components by visual inspection or manually setting a certain threshold to determine noise components. However, in our proposed PCA-GLM method, the selection of the noise component in the LS channels is automatically determined by choosing the SS component that resembles the LS component in the highest temporal correlation. Therefore, with the formulation of nuisance regressors, the entire preprocessing pipeline becomes automatic, which enables efficient batch processing in clinical applications of fNIRS.
In comparing our method with other commonly used methods, our PCA-GLM method contains a two-step PCA approach to derive the noise component, which differs from existing methods that performed PCA only on the LS channels (Zhang et al., 2005), or only on the SS channel (Sato et al., 2016). Previously, PCA-based filtering has been exploited to remove the global hemodynamic component by decomposing regular LS recordings only. However, without directly measuring superficial signals from SS channels, the decomposed components from LS data alone are insufficient in delineating the noise contribution. As shown in Fig. 4B and C, the time courses of PC-LS exhibited a task-alike modulation, in a very similar manner with the LS channel of interest located over the motor area. Therefore, directly removing a PC-LS that carries a task effect will attenuate the effect size in these PCA filter methods. Similarly, a single SS channel which is located near the LS channel might also carry the task effect. Therefore, in our developed method, we used PC-SS as a regressor to regress out the physiological and superficial noise in fNIRS signals. Our method derived the noisy component via PCA on a total of eight SS channels evenly distributed over the scalp, motivated by our previous findings that PC-SS exhibits a high correlation with PC-LS and that PC-SS is primarily contributed by superficial noise (Zhang et al., 2019). Therefore, the PC-SS was removed, rather than PC-LS or single SS channel, which yielded a significantly improved CNR, as shown in Fig. 6.
Our approach to derive the noise component from SS channels also differs from previous methods that utilized PCA only on the SS channels, e.g., the Sato method (Sato et al., 2016). In the method by Sato et al., PCA was performed on SS recordings and the 1st PC was chosen. However, the 1st PC of SS does not guarantee the component represents mostly the physiological noise, even though the 1st PC captures the highest portion of the variance in multi-channel recordings. Supplemental Figs. 3 and 4 compare our PCA-GLM method without auxiliary data and the Sato method. When deliberately not including auxiliary regression, PCA-GLM performed marginally better than the Sato method in both the visual and motor channels. In addition, when including the auxiliary data, our PCA-GLM method has yielded a significantly higher CNR and goodness of fit (adjusted R 2 ) than the Sato method (shown in Fig. 6). We further examined whether the 1st PC of SS showed the strongest correlation with the physiological noise from LS recordings (i.e., PC-LS), and found that only in 33 out of 40 sessions the 1st PC of SS was the designated noise component, PC-LS. In the representative cases when our PCA-GLM method identified a different PC as noise than the Sato method did (Supplemental Fig. 3), the PCA-GLM without auxiliary data has resulted in higher activations in the time courses of the contralateral motor channel and bilateral visual channels, while the ipsilateral motor channel showed a lower level of activity. Since this was a visually cued right-hand movement, the results from the PCA-GLM are consistent with the physiology of the motor system and visual system. Especially, these two cases highlighted the advantage of our PCA-GLM method over the Sato method in concurrent motor and visual activations, when the 1st PC of Short channels may not always be the best noise component to remove. Therefore, our approach of choosing the PC-SS that has the highest correlation with PC-LS rather than the 1st PC of SS recordings provides a more robust choice for capturing the physiological noise.
In the current study, our method removes one noise component that is spatially uniform and commonly reflected in LS and SS measurements (CSUs of PC-LS are 1.74 ± 0.77; temporal correlations between PC-LS and PC-SS are 0.67 ± 0.14), which has a reliable origin from superficial noise (Kohno et al., 2007;Tong, 2010;Zhang et al., 2005). Importantly, our method has developed a framework where more than one noise components from the whole-head LS and SS recordings can be derived and removed, such as originated from temporal muscles. However, future studies are warranted to validate the origins of additional noise components.
Furthermore, our study utilized the auxiliary recordings of pulse, respiration and movement. These additional nuisance regressors address the fluctuations induced by pulses, respiration and movement. Results in the Supplemental Fig. 1 demonstrated the benefit of including the auxiliary data in our PCA-GLM method, resulting in higher CNR, better fitting and lower variance. Notably, this benefit of denoising was achieved by including zero-shift time courses of auxiliary measurement, although studies have shown that a time-embedding approach (von Lühmann et al., 2020) or considering a response function to the respiration (Birn et al., 2008) or cardiac pulses (Chang et al., 2009a) might further capture their impact on cerebral hemodynamics. While these fluctuations are typical of less concern to a task in block design, they have been suggested to bias the organization of large-scale networks, especially in the resting brain (Mesquita et al., 2010;Yuan et al., 2013).

Conclusions
In summary, we have developed a novel denoising method to remove the physiological noise in whole-head fNIRS recordings. Our approach of identifying and removing noise based on long-and short-seperation channels and auxiliary measurements is completely automatic without manual maneuver. The results demonstrated that the proposed pipeline is capable of detecting concurrent and distant brain activations in a whole-head fNIRS montage, with improvements in detectability and reliability upon established methods. Results suggest that our denoising method for whole-head fNIRS can accelerate the usage of fNIRS in neuroscience research and clinical care, especially in studying largescale brain networks in normal and diseased conditions.

Funding
This work was supported by the National Science Foundation RII Track-2 FEC 1539068, Oklahoma Science and Technology Research and Development HR16-057 and Institute for Biomedical Engineering, Science and Technology at the University of Oklahoma.

CRediT author contribution statement
FZ developed the method, collected and analyzed the data and wrote the manuscript. DC, AFK and YC collected the data and wrote the manuscript. LD designed the study and wrote the manuscript. HY designed the study, analyzed the data and wrote the manuscript.