Brain segmentation, spatial censoring, and averaging techniques for optical functional connectivity imaging in mice

Abstract: Resting-state functional connectivity analysis using optical neuroimaging holds the potential to be a powerful bridge between mouse models of disease and clinical neurologic monitoring. However, analysis techniques specific to optical methods are rudimentary, and algorithms from magnetic resonance imaging are not always applicable to optics. We have developed visual processing tools to increase data quality, improve brain segmentation, and average across sessions with better field-of-view. We demonstrate improved performance using resting-state optical intrinsic signal from normal mice. The proposed methods increase the amount of usable data from neuroimaging studies, improve image fidelity, and should be translatable to human optical neuroimaging systems.


Introduction
Optical functional neuroimaging holds promise to link mouse models of neurological disease to the insights about human neuroscience gained from functional magnetic resonance imaging (fMRI). A major analysis tool in this field is resting-state functional connectivity, which enables mapping of distributed brain networks using correlated hemodynamics in the absence of tasks [1]. The ability to assess brain functional integrity without task- or stimulus-paradigms is well-suited for both clinical populations [2] and preclinical mouse models [3]. Recently, resting-state functional connectivity analysis has been adapted for use with optical intrinsic signal (OIS) imaging [4,5] and for fluorescence imaging using voltage-sensitive dyes [6] and genetically encoded calcium indicators [7][8][9]. These techniques, in turn, are stimulating development of new imaging biomarkers of neurologic disease in preclinical models [10][11][12][13].
While resting-state optical imaging algorithms are informed by fMRI, direct translation of the entire fMRI processing stream is not possible. For example, concurrent anatomic imaging and whole-brain coverage in MRI enables advanced brain segmentation techniques [14][15][16][17]. Furthermore, in MRI, once the cortical surface is identified and the data transformed to an atlas, then analysis across subjects is relatively simple. Since every location in the brain is sampled in every subject, the data can be averaged or concatenated. Conversely, no reliable structural data exists in optical imaging with which to segment the data. For optical neuroimaging in mice, the images consist of the dorsal surface of the brain as well as surrounding tissue (e.g., skull, overlying veins, skin, and hair). In the existing literature, to segment the brain from these other tissue types, a single imaging frame is viewed, and the region corresponding to brain is traced manually [4,5,8,18]. Analysis of functional signals is then performed on pixels judged to be within the brain mask. One important limitation of this method is that it relies significantly on operator judgement which, we hypothesize, causes unintended variability in border selection. Additionally, this approach is limiting because it does not offer flexibility to remove individual pixels within the larger field-of-view, for example to mask regions of low signal-to-noise due to overlying venous sinuses or optical defects in the cranial window.
Lack of whole-brain coverage further complicates the process of combining data across multiple scans. Since the imaged region may vary between subjects, or even sessions from the same subject, an individual pixel might correspond to brain in one mouse but not in another. The most common method to account for variation has been to perform the final analysis only on the intersection of masks from all sessions (i.e., only on those brain areas that were imaged in every session) [10,11,18,19]. The only advantage of this method is simplicity. If there is only slight variation between sessions, then the number of pixels lost by taking the intersection may be minimal, but for studies that include a large number of subjects, even with an ideal field-of-view, the number of pixels lost can become substantial. Similarly, any session with a small field-of-view must be excluded, or else it would corrupt the global analysis for the entire population.
In this contribution, we improve on existing methods by developing semi-automated optical brain image segmentation techniques and averaging methods that tolerate heterogeneity in the field-of-view between scans. Briefly, we first created automated pixel-wise quality control metrics to exclude low quality pixels from the data analysis. Previously, systematic quality control has been used to exclude entire imaging sessions [10,11], but to our knowledge it has never been used on a pixel-wise basis. The mask created by this automated process is then used as an initial condition for manual brain segmentation, which we show reduces unintended variation in segmentation. Secondly, we developed methods to combine these data across imaging sessions and across subjects in a censored fashion. The new averaging approach accounts for varying fields-of-view and brain masks, with the understanding that data in any pixel may only be partially known. The success of these visual image processing algorithms is demonstrated with resting-state OIS data from mice, but the same fundamental approach should be applicable to task-based paradigms and to other optical modalities such as diffuse optical tomography. Broadly, these techniques should permit an easier, standardized, and more thorough analysis in functional studies that employ optical imaging techniques to study the brain and have large sample sizes. Thus, they will improve study fidelity and aid the development of imaging biomarkers.

Methods
Definitions of the variables used herein are supplied in Table 1, as well as at first use.

Optical intrinsic signal imaging system
The optical intrinsic signal (OIS) imaging system is similar to that described previously [4] (Fig. 1(A)). Briefly, illumination was provided by a 470 nm LED (Thorlabs M470L3-C1). Sequential images were acquired with a cooled CCD camera (iXon 887, Andor Technologies). Crossed polarizers were used to eliminate light from specular reflection. The system was controlled with custom-written software using Matlab and the Andor software development kit.

Animal preparation and imaging
All procedures were approved by the institutional animal care and use committee (IACUC) at the Children's Hospital of Philadelphia. Male C57bl/6 mice (ages 8 to 13 weeks) were anesthetized with a mixture of ketamine (100 mg/kg) and xylazine (10 mg/kg) through intraperitoneal injection. After adequate anesthesia was achieved (usually after approximately 10 minutes), the animal was held in place with ear bars and kept warm with a heating pad. The hair on the dorsal surface of the head was removed with a depilatory cream, and the scalp was cleaned with iodine and ethyl alcohol. The scalp was incised and reflected to expose the skull from the olfactory bulb (anteriorly) to the superior colliculus (posteriorly) with as much lateral exposure as possible. Either the exposed skull was coated with a thin layer of mineral oil to create a smooth optical interface, or a glass through-skull cranial window was placed using transparent dental cement [20] (Fig. 1(B)). This procedure usually took about 10-15 minutes, and imaging began about 20-30 minutes after anesthetic injection.

Table 1 (excerpt). Definitions of variables.
M(n, m): Boolean mask for the correlation coefficient matrix.
F(n, m): Matrix of Fisher-transformed correlation coefficients.
a The subscript i denotes the segmentation method.

Resting-state time series were acquired in 5 minute imaging runs. Three to six runs were obtained sequentially in each mouse (15 to 30 minutes of total imaging time). Data from the camera consist of a series of images of light intensity over time, Φ(x, y, t). The two-dimensional images cover the dorsal surface of the mouse brain and surrounding tissue (Fig. 1(C)), with the x direction indicating right-to-left and the y direction posterior-to-anterior. The field-of-view of the camera was about 1.5 cm (sampled by 128 pixels for a pixel size of approximately 100 µm × 100 µm) along both the x and y directions. Each 5-minute run consisted of 8928 frames sampled at 29.76 Hz. To highlight the utility of our statistical methods for wide variations in field-of-view, we also utilized runs that were suboptimal for analysis; errors associated with suboptimal runs include camera saturation and limited dental cement application.

Fig. 1(C). Schematic of the field-of-view demonstrating the goal of segmentation. The exposed brain is shown in light gray, with surrounding hair and skin in dark gray. Major suture landmarks are shown in red. Points used for the atlasing affine transformation are in yellow.

Masking
Rather than performing a quality-control procedure to keep or eliminate entire runs, we developed a pixel-wise masking procedure to remove pixels likely to contain data unrelated to neural hemodynamics. This procedure was designed to ensure high data quality and to serve as a guide for later brain segmentation. Masks were created to remove (1) saturated pixels, (2) low signal-to-noise pixels, and (3) pixels whose data are uncorrelated with their neighbors. In each case, the mask consists of a Boolean matrix of pixels, B, where a value of one indicates a pixel to be kept for analysis, and zero indicates a pixel to be excluded.
First, if a pixel in the camera was saturated at any time (counts reaching the 14-bit limit of 2^14), then the entire time course for that pixel was excluded:

$$B(x, y) = \begin{cases} 1 & \text{if } \Phi(x, y, t) < 2^{14} \text{ for all } t, \\ 0 & \text{if } \Phi(x, y, t) \geq 2^{14} \text{ for any } t. \end{cases} \tag{1}$$

To remove temporal drift, a linear trend was removed from every pixel while maintaining the same mean. Specifically, the mean value of each pixel over time was computed, each pixel's time course was fit to a linear trend, and the fitted trend was then subtracted. For future use, we also define the temporal standard deviation of each pixel.

To remove pixels with low signal-to-noise, we hypothesized that system noise should follow a Poisson process, with the standard deviation of each pixel increasing roughly linearly with the square root of that pixel's mean intensity. This expectation was borne out by visual inspection of the data (see Results). The standard deviations across each session were therefore fit to a linear function of the square root of the mean intensity. Pixels were considered to have good signal-to-noise if their standard deviation was within a tolerance (set by the parameter λ₁) of this function; otherwise they were excluded. For the present data, we employed λ₁ = √2, a value that was chosen based on visual inspection of the resulting masks and was found to be a good discriminator.
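As an illustration of the first two masks, a minimal NumPy sketch is given below. It is not the original Matlab implementation: the function names are hypothetical, the data cube is assumed to be indexed as (x, y, t), and, because the exact tolerance rule is not stated in the text, the sketch assumes that a pixel passes when its standard deviation does not exceed λ₁ times the fitted value.

```python
import numpy as np

def saturation_mask(phi, bit_depth=14):
    """True where a pixel never reaches the saturation value 2**bit_depth."""
    return (phi < 2 ** bit_depth).all(axis=2)

def detrend_keep_mean(phi):
    """Subtract each pixel's least-squares linear trend, preserving its mean."""
    t = np.arange(phi.shape[2], dtype=float)
    t_c = t - t.mean()                                   # centered time axis
    mean = phi.mean(axis=2, keepdims=True)
    # Per-pixel least-squares slope of intensity against time.
    slope = ((phi - mean) * t_c).sum(axis=2, keepdims=True) / (t_c ** 2).sum()
    return phi - slope * t_c                             # t_c has zero mean, so the pixel mean is unchanged

def snr_mask(phi_detrended, lam1=np.sqrt(2)):
    """Fit sigma = c*sqrt(mean) + d across pixels and keep pixels near the fit.

    ASSUMPTION: 'within a tolerance' is implemented as sigma <= lam1 * fit;
    the original rule may differ.
    """
    root_mean = np.sqrt(phi_detrended.mean(axis=2))
    sigma = phi_detrended.std(axis=2, ddof=1)
    c, d = np.polyfit(root_mean.ravel(), sigma.ravel(), 1)
    return sigma <= lam1 * (c * root_mean + d)
```

Each function returns (or feeds) a Boolean image in which True marks a pixel to keep, matching the convention for B described above.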
Finally, we performed a correlation analysis of each pixel with each of its four neighboring pixels: r(x−1, y), r(x+1, y), r(x, y−1), and r(x, y+1). For a pixel to be included, all of these correlation values had to meet a minimum threshold (λ₂):

$$B(x, y) = \begin{cases} 1 & \text{if all } r \geq \lambda_2, \\ 0 & \text{if any } r < \lambda_2. \end{cases} \tag{8}$$
Note that this method is similar to the fMRI metric regional homogeneity (ReHo) [21], although it relies on correlation coefficients rather than Kendall's coefficient of concordance. Our expectation was two-fold. First, in pixels corresponding to regions in space where few photons entered tissue and diffused (e.g., due to hair or overlying large vasculature), there would be little correlation between neighboring pixels. Secondly, along interfaces (e.g., brain-skin at the edge of the cranial exposure), there would be less correlation between adjacent pixels. Note also that, with the system's field-of-view, the pixel size of the camera is well above the speckle size for the incident light. For our data, we found that a λ₂ of 0.1 was a good discriminator; again, this value was chosen based on visual inspection of the data and resulting masks.
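The local correlation mask can be sketched as follows. This is a minimal NumPy illustration rather than the original code: the function name is hypothetical, the cube is assumed to be indexed as (x, y, t), and pixels on the image border are assumed to be tested only against the neighbors that exist.

```python
import numpy as np

def local_correlation_mask(phi, lam2=0.1):
    """Keep a pixel only if its Pearson correlation with every available
    four-connected neighbor is at least lam2. Returns (mask, r_min)."""
    z = phi - phi.mean(axis=2, keepdims=True)
    norm = np.sqrt((z ** 2).sum(axis=2, keepdims=True))
    # Unit-norm time courses; constant pixels become all zeros (correlation 0).
    z = np.divide(z, norm, out=np.zeros_like(z), where=norm > 0)
    r_x = (z[:-1, :, :] * z[1:, :, :]).sum(axis=2)   # (x, y) with (x+1, y)
    r_y = (z[:, :-1, :] * z[:, 1:, :]).sum(axis=2)   # (x, y) with (x, y+1)
    r_min = np.ones(phi.shape[:2])
    r_min[:-1, :] = np.minimum(r_min[:-1, :], r_x)
    r_min[1:, :] = np.minimum(r_min[1:, :], r_x)
    r_min[:, :-1] = np.minimum(r_min[:, :-1], r_y)
    r_min[:, 1:] = np.minimum(r_min[:, 1:], r_y)
    return r_min >= lam2, r_min
```

The second return value, the minimum neighbor correlation per pixel, is the kind of image that can be inspected visually when choosing λ₂.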

Brain segmentation
Two methods were used for brain segmentation and then compared. The first method is the one used in prior literature. A false-color image of the mouse brain acquired by the camera was viewed, and the brain was manually segmented from its non-neural surroundings by drawing a border around the brain; pixels inside the border form the brain mask. For the remainder of the manuscript, we refer to this method as manual segmentation.
For the new, second method, the three masks created above were combined to create one mask per imaging session, in which a pixel was kept only if it passed all three quality metrics. This combined mask was then applied to the same false-color image of the mouse brain. In this way, the person performing the segmentation could see which pixels had been automatically excluded. Using these masked pixels as a guide to where the brain/surrounding-tissue interface should be, further regions were excluded from the brain segmentation manually. Pixels eligible for further processing needed both to be within this segmentation and to pass the combined quality metrics. For the remainder of the manuscript, we refer to this method as guided segmentation.
Our hypothesis was that the guided segmentation would create more consistent brain segmentations than the manual segmentation. To test this hypothesis, for each run, manual segmentation and guided segmentation were both performed twice by one rater (BRW) and once by a second rater (JPC). Both intra- and inter-rater agreement were quantified using three metrics. The first, the Dice coefficient, is a common metric for measuring segmentation overlap; however, it is relatively insensitive to error in settings of high overlap. We also employed the Jaccard coefficient, which is similar to the Dice coefficient but is more sensitive to error. Finally, since the segmentation errors are entirely errors in the border, for a third metric we used the boundary F1 score (Matlab function bfscore), which compares agreement between the segmentation border contours. For all of these metrics, the possible values range from zero (no agreement) to one (perfect agreement). Values of the metrics for both intra- and inter-rater comparisons are presented as medians and interquartile ranges (IQRs). To judge improvement with the new methodology, the value of each metric for manual segmentation was compared against that for guided segmentation using paired t-tests.
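For reference, the two overlap metrics can be computed from a pair of Boolean masks as in the short sketch below (function names are ours; the boundary F1 score is not reproduced here). The two coefficients are related by J = D / (2 − D), which is why the Jaccard coefficient falls faster than the Dice coefficient as overlap decreases.

```python
import numpy as np

def dice(a, b):
    """Dice coefficient: 2|A and B| / (|A| + |B|) for two Boolean masks."""
    a, b = np.asarray(a, dtype=bool), np.asarray(b, dtype=bool)
    total = a.sum() + b.sum()
    return 2.0 * np.logical_and(a, b).sum() / total if total else 1.0

def jaccard(a, b):
    """Jaccard coefficient: |A and B| / |A or B| for two Boolean masks."""
    a, b = np.asarray(a, dtype=bool), np.asarray(b, dtype=bool)
    union = np.logical_or(a, b).sum()
    return np.logical_and(a, b).sum() / union if union else 1.0
```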

Image processing
Further analysis beyond the raw images was performed separately with both segmentation methods, with procedures similar to those in prior reports [4]. Intensity measures were converted to relative absorption changes, ∆µ a (x, y, t), using the modified Beer-Lambert law, in which L denotes the optical path-length in tissue. Pixel time traces were then filtered to select frequencies from 0.008 to 0.09 Hz (i.e., the low-frequency hemodynamic fluctuations responsible for functional connectivity [1]). Data were then down-sampled to 1 Hz.
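A common single-wavelength form of the modified Beer-Lambert law is ∆µ a = −(1/L) ln(Φ/⟨Φ⟩), where ⟨Φ⟩ is the temporal mean intensity of the pixel. The sketch below uses this form as an assumption; the exact expression and path-length value used for these data may differ, and the function name is hypothetical.

```python
import numpy as np

def delta_mu_a(phi, path_length):
    """Relative absorption change from intensity (time on the last axis).

    ASSUMPTION: the baseline is each pixel's temporal mean intensity.
    """
    baseline = phi.mean(axis=-1, keepdims=True)
    return -np.log(phi / baseline) / path_length
```

With this sign convention, a drop in detected intensity below baseline corresponds to a positive absorption change.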

Spatial smoothing.
Each image of absorption change was then smoothed with a Gaussian kernel, g (5 × 5 pixel box, 1.3 pixel standard deviation, normalized to total value 1) [4]. In prior publications, this smoothing was not affected by the segmentation: every pixel in the 5 × 5 box contributed to the smoothed value regardless of any mask. We adopt the same approach for our analysis of manually segmented data. By contrast, for guided segmentation images processed with pixel-wise quality metrics, only pixels that had passed the masks were included in the Gaussian smoothing. With this method, we also tracked the number of pixels whose values are included in each smoothed pixel. A pixel was included in the final analysis if either: (1) it had itself passed the quality metrics, or (2) its value could be interpolated through the smoothing algorithm by using at least ten pixels (out of 24 neighboring pixels within the 5 × 5 pixel smoothing box) that had passed the quality metrics. Thus, after Gaussian filtering, we effectively had a new composite image mask with some of the gaps from masking filled in by interpolation.
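One way to realize mask-aware smoothing is normalized convolution, sketched below in NumPy. This is an illustration under stated assumptions, not the original implementation: the kernel weights are assumed to be renormalized over the valid pixels in each 5 × 5 box, the image is zero-padded at its edges, and all function names are hypothetical.

```python
import numpy as np

def gaussian_kernel(size=5, sigma=1.3):
    """Gaussian kernel on a size-by-size box, normalized to sum to 1."""
    ax = np.arange(size) - size // 2
    g1 = np.exp(-ax ** 2 / (2 * sigma ** 2))
    g = np.outer(g1, g1)
    return g / g.sum()

def filter_sum(img, kernel):
    """Correlate img with kernel using zero padding (pure NumPy)."""
    k = kernel.shape[0] // 2
    padded = np.pad(img, k)
    out = np.zeros(img.shape, dtype=float)
    for dx in range(kernel.shape[0]):
        for dy in range(kernel.shape[1]):
            out += kernel[dx, dy] * padded[dx:dx + img.shape[0], dy:dy + img.shape[1]]
    return out

def masked_smooth(img, mask, min_neighbors=10):
    """Smooth using only pixels that passed the mask; return (image, new mask)."""
    mask = np.asarray(mask, dtype=bool)
    g = gaussian_kernel()
    m = mask.astype(float)
    num = filter_sum(np.where(mask, img, 0.0), g)       # weighted sum of valid values
    den = filter_sum(m, g)                               # total weight of valid pixels
    smoothed = np.divide(num, den, out=np.zeros_like(num), where=den > 0)
    box = np.ones_like(g)
    box[g.shape[0] // 2, g.shape[1] // 2] = 0.0          # count the 24 neighbors only
    n_valid = filter_sum(m, box)
    new_mask = mask | (n_valid >= min_neighbors)         # criterion (1) or (2)
    smoothed[~new_mask] = np.nan
    return smoothed, new_mask
```

For example, inside a 5 × 5 block of failed pixels, the corner pixel of the block has 16 valid neighbors and is filled in by interpolation, whereas the central pixel has none and remains excluded.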

Affine transform.
From a representative image for each session, two landmarks were manually located: the lambda (at the midline sagittal junction of the cerebrum and superior colliculus) and the midline sagittal junction of the cerebrum and the olfactory bulb (Fig. 1(C)). Using these references, all sessions were affine transformed without shear (i.e., translation, rotation, and one stretch parameter were allowed to vary) to a common atlas space. Both brain masks (manual and guided) for each mouse were correspondingly transformed.
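Two landmark pairs determine exactly four parameters, so a shear-free transform with a single (isotropic) stretch can be solved in closed form. The sketch below does this with complex arithmetic, treating each point (x, y) as x + iy so that the transform is z' = a·z + b. It assumes the one stretch parameter is an isotropic scale; the function names and the atlas coordinates in the usage note are hypothetical.

```python
def similarity_from_landmarks(src_a, src_b, dst_a, dst_b):
    """Return complex (a, b) such that z' = a*z + b maps src_a->dst_a and src_b->dst_b.

    abs(a) is the isotropic scale and the phase of a is the rotation angle.
    """
    p1, p2 = complex(*src_a), complex(*src_b)
    q1, q2 = complex(*dst_a), complex(*dst_b)
    a = (q2 - q1) / (p2 - p1)      # rotation and scale
    b = q1 - a * p1                # translation
    return a, b

def apply_transform(a, b, point):
    """Apply z' = a*z + b to an (x, y) point."""
    z = a * complex(*point) + b
    return (z.real, z.imag)
```

For instance, `similarity_from_landmarks((60, 20), (66, 100), (64, 24), (64, 104))` would map a session whose landmarks lie at pixels (60, 20) and (66, 100) onto illustrative atlas positions (64, 24) and (64, 104); the same (a, b) would then be applied to every pixel coordinate and to both brain masks.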

Global signal regression.
A global signal was then created by averaging over all pixels within the brain segmentation. The global signal was regressed from all pixels before further analysis. These steps were performed separately for each segmentation; the subscript, i, corresponds to either manual, M, or guided, G, segmented data.
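Global signal regression amounts to a least-squares projection, as in the following minimal sketch. It assumes the data have already been reshaped to pixels-by-time and that regression is performed against the mean-centered global signal; the function name is hypothetical.

```python
import numpy as np

def regress_global_signal(data, brain):
    """data: (N, T) pixel time series; brain: (N,) Boolean brain mask.

    Returns the residual time series and the (centered) global signal.
    """
    g = data[brain].mean(axis=0)          # average over pixels inside the brain
    g = g - g.mean()                      # center so only fluctuations are regressed
    beta = data @ g / (g @ g)             # least-squares coefficient per pixel
    return data - np.outer(beta, g), g
```

After this step every pixel's residual time series is orthogonal to the global signal.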

Functional connectivity analysis and statistics
For simplicity of analysis, the data series ∆µ a (x, y, t) is reshaped to be an N-by-T matrix, ∆µ a (n, t), wherein there are N pixels and T time-points. Additionally, for simplicity we drop the prime superscripts (as all data forward will have undergone spatial smoothing and global signal regression) and the subscript denoting the segmentation method (since the methods below apply to either segmentation). Without loss of generality, each pixel's time series was scaled to zero-mean and unit-variance.

Correlation matrix and masking.
With this procedure, we created the matrix of correlation coefficients, R(n, m), between the time series of every pair of pixels n and m. Since correlation coefficients between pixels are of interest only if both are in the brain mask, we constructed a mask for this R matrix: M(n, m) is unity if R(n, m) is a correlation coefficient between two pixels within the brain mask, and it is zero otherwise. For seed-based functional connectivity analysis, a pixel of interest (the seed) is chosen, and the corresponding row from the connectivity matrix, R, is selected. These correlation coefficients are then reshaped to 128 × 128 pixels and are displayed as a map of correlation coefficients on the surface of the mouse brain (with the equivalent row of M providing the brain segmentation). Seeds were selected based on the expected locations of canonical cortical regions from histological mouse brain atlases.
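The construction of R, M, and a seed map can be sketched as follows (hypothetical function names; a small image is advisable when experimenting, since for 128 × 128 pixels the full matrix has 128⁴ entries).

```python
import numpy as np

def correlation_and_mask(data, brain):
    """data: (N, T) time series; brain: (N,) Boolean mask. Returns (R, M)."""
    z = data - data.mean(axis=1, keepdims=True)
    sd = z.std(axis=1, keepdims=True)
    z = np.divide(z, sd, out=np.zeros_like(z), where=sd > 0)   # zero-mean, unit-variance
    R = z @ z.T / data.shape[1]                                 # Pearson correlations
    M = brain[:, None] & brain[None, :]                         # both pixels in the brain
    return R, M

def seed_map(R, M, seed, shape):
    """Row of R for one seed pixel, reshaped to the image, NaN outside the mask."""
    return np.where(M[seed], R[seed], np.nan).reshape(shape)
```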

Averaging.
The correlation coefficients are not normally-distributed (they are bounded by −1 and 1). Therefore, for statistical analysis, they are converted using Fisher's transform to an approximate normal distribution: F(n, m) = arctanh(R(n, m)). This statistic has the advantage that values can be more accurately averaged compared to averaging raw correlation coefficients [22,23].
If the viewed area was the same for every session and for every mouse, then it would be straightforward to create an averaged (transformed) correlation matrix, because F could simply be averaged across sessions. However, since B and M are different for each session, we must account for this variation. In previous work, analysis was performed on the intersection of all brain masks; we will refer to this procedure in this manuscript as the intersect method. Let the number of sessions for a given mouse be S, and let the subscript s denote the relevant matrix from each session. Further, let the subscript I denote overall data averaged with the intersect method. Then the mask M_I is the product of the session masks M_s, and F_I is the mean of F_s over the S sessions within that mask. We propose to explore an alternative averaging method that treats the data as censored, that is, the data are only partially known. For the purpose of this manuscript, we call this procedure the censored method. Let the subscript C denote overall data averaged with the censored method. Each component of F_C is averaged over only those sessions in which that component lies within the session mask M_s.
We can also keep track of the number of pixels included in each component of F_C. For display, the averaged F is converted back to a correlation coefficient: R(n, m) = tanh(F(n, m)).
Seed-based correlation analysis is performed as before with a seed pixel being chosen and the corresponding row of R displayed with the brain mask being the same row of M.
For the intersect method, as one might expect, the final brain segmentation is the intersection of the component brain segmentations. That is, all non-zero rows of M_I are equal. On the other hand, with the censored method, the brain segmentation for each seed-based map depends on the seed pixel. For example, consider the case of two sessions with overlapping but non-identical segmentations. Then, pixels with data only from session 1 would have no correlation coefficients with pixels that have data only from session 2. Thus, it is important to keep track of M_C so as to be able to construct the appropriate field-of-view for any given seed pixel.
Now that we have averaged over sessions within each mouse, an equivalent averaging procedure can be used to average across mice, by replacing the products and sums over sessions with products and sums across mice.
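The difference between the two averaging schemes can be made concrete with a minimal NumPy sketch. The function names are hypothetical, and the censored mask is assumed to be unity wherever at least one session contributes a value.

```python
import numpy as np

def intersect_average(F_list, M_list):
    """Mean of Fisher-transformed matrices where every session has data."""
    M_I = np.logical_and.reduce(M_list)
    F_I = np.where(M_I, np.mean(F_list, axis=0), np.nan)
    return F_I, M_I

def censored_average(F_list, M_list):
    """Mean over only the sessions that contain each pixel pair."""
    M = np.array(M_list, dtype=bool)
    F = np.where(M, np.array(F_list, dtype=float), 0.0)   # ignore values outside each mask
    count = M.sum(axis=0)                                  # sessions contributing per entry
    F_C = np.divide(F.sum(axis=0), count,
                    out=np.full(count.shape, np.nan), where=count > 0)
    return F_C, count > 0, count
```

With two sessions covering pixels {0, 1} and {1, 2} of a three-pixel image, the intersect method retains only the single entry (1, 1), whereas the censored method retains seven of the nine entries; the entries (0, 2) and (2, 0) remain empty because no session observed both pixels. Applying tanh to the averaged values converts them back to correlation coefficients for display, and the same functions can be reused across mice by passing per-mouse averages and masks.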

Results
Seven mice were scanned using our custom-built OIS imaging system. The mice were imaged in 5 minute scans with 3 to 6 scans per mouse. To highlight the ability and value for combining different fields-of-view, mice with suboptimal sessions were included in the data set. Two of the mice were scanned with exposed skulls covered by a thin layer of mineral oil. Five mice were scanned with through-skull cranial windows secured with dental cement; in three of these mice, insufficient dental cement was used, and thus the field-of-view was restricted to the medial cortex. Additionally, there are sporadic other artifacts in the images including camera saturation and bubbles in the dental cement, which will be identified and processed by the pixel-wise quality control metrics.

Brain segmentation
The first step in data processing is quality assessment performed on a pixel-wise basis. Saturation of the CCD camera was rare, but it occurred in some sessions, and in one run it resulted in the exclusion of a relatively large number of pixels in the center of the field-of-view (Fig. 2). The median number of pixels excluded by this metric across all sessions was 0 (IQR: 0-0; range: 0-226, 0-1.4%). Note that there are 16,384 (128²) total pixels in the image.
We next masked pixels based on signal-to-noise (Fig. 3). For each session, each pixel's standard deviation over time was plotted against the square root of its mean value. We expected this plot to follow a linear relationship (see Methods). By visual inspection, this expectation was true, with the majority of scans having a tight linear distribution (Fig. 3(B)). The majority of pixels that failed this signal-to-noise threshold fell along the brain-skin interface, but a number of pixels were also excluded from the central area of brain indicating that not all central pixels should be assumed to be of equal quality, as would occur with fully manual segmentation ( Fig. 3(C-D)). For other scans, more noisy pixels were present (Fig. 3(F-H)), but again the linear relationship with the square-root of mean pixel counts held, and pixels that were excluded helped to outline the brain interface. The median number of pixels excluded by this metric across all sessions was 208 (1.3%; IQR: 148-300, 0.9-1.8%; range: 19-565, 0.1-3.5%).
The third quality metric is based on the local correlation between adjacent pixels (Fig. 4). Across the visualized cerebral cortex, local correlations are high (near one). Note that these local correlations will not be equal to the correlation coefficients between adjacent pixels in later functional connectivity analysis because, during the current masking step, the data have not undergone temporal filtering and spatial smoothing, which increase local correlation coefficients. As expected, local correlations are very low (near zero) in the regions covered with fur (the upper right and upper left corners of Fig. 4(B, E)); in these cases, the photons do not enter tissue. Local correlation is also low at the border regions (either brain-skin or at the edge of the dental cement, arrows in Fig. 4). The mask thus created is also helpful for delineating the brain region and for guiding manual segmentation. Overlying venous sinuses (seen along the midline, for example) also result in lower local correlation. This metric removed the largest number of pixels (due to the large areas of hair in the full images), with a median number of pixels excluded across all sessions of 8193 (50%; IQR: 6661-10011, 41-61%; range: 2687-11048, 16-67%).
The structure of the masks using these metrics was generally similar across runs during the same imaging session (Fig. 5). In all cases, the masks outlined the interface between brain and surrounding tissue. Additionally, the venous sinuses that run along the sagittal and coronal sutures were frequently removed by masking. However, the masks also captured sources of noise specific to each run: either large areas of spurious signal (Figs. 2(D) and 3(H)) or single pixels unique to each run (Fig. 5).

Fig. 3. (A) ... (Fig. 2(A)) shown as an anatomic reference. (B) The standard deviation of each pixel over time is plotted against the square root of the intensity. Data is from mouse 2, session 5. The linear regression line of best fit is shown in green. Pixels meeting the SNR threshold are colored blue while pixels that failed and will be excluded from analysis are shown in red. (C) Image of the standard deviation for each pixel divided by the square root of the mean intensity for the same data. (D) SNR mask created from the data in C. In this session, only a few pixels were excluded based on this metric (shown in black). (E-H) The same analysis as A-D, now for mouse 3, session 3. Here, while the data is noisier, the same linear relationship holds. Pixels at the border of the dental cement (green arrows) show lower SNR and are excluded (black pixels in H). Additionally, there is a small region in the left sensorimotor cortex (blue arrows) that has low signal-to-noise, which upon examination of the data seems to be due to instability in the camera read-out. Individual low quality pixels can be excluded, and the entire run need not be discarded.
When combined, the three masks form a basis for the guided segmentation. While manual segmentation resulted in errors at the borders of the brain, the semi-automated segmentation resulted in improved intra-rater reliability (Fig. 6). This improvement was quantified. Guided segmentation resulted in an increase in all measures of agreement ( Fig. 7 and Table 2). The Dice coefficient, although relatively insensitive to small variations in overlap, was high for both methods and did exhibit a significant increase with guided segmentation. The Jaccard coefficient, which is more sensitive to small differences, also significantly increased with guided segmentation. The boundary F1 score, which is most sensitive to small differences in the boundary position was highly variable with manual segmentation, but was consistently high with guided segmentation (Fig. 7 and Table 2). As expected, inter-rater agreement was lower than intra-rater agreement, but inter-rater agreement was much improved by using guided segmentation. With manual segmentation, the boundary F1 score, in particular, was very low. Interestingly, with guided segmentation, all measures of agreement were nearly as good as intra-rater agreement ( Fig. 7 and Table 2).

Single-session functional connectivity
In addition to serving as a guide for segmentation, masked pixels were removed from further analysis, including smoothing and global signal regression. One might hypothesize that this masking should result in cleaner hemodynamic signals and less noise in the correlation maps. We have examined this question for a variety of scenarios and summarize some of these findings below. Generally, when a seed was chosen from a pixel that passed all the quality metrics, there was not an appreciable difference between the maps using either segmentation method (Fig. 8(A-B)). Similarly, if a seed was chosen from a pixel that failed a quality metric, but was surrounded by good pixels, then there was not a large effect (Fig. 8(C-D)). Thus, for this basic level of functional connectivity analysis, spatial smoothing did a reasonable job of removing the effects of noisy pixels even in the absence of masking.

Fig. 4. (A) ... (Fig. 2(A)) shown as an anatomic reference. (B) Image of the minimum value of the correlation coefficients from each pixel's surrounding four pixels. Areas covered with hair (in the four corners of the image) have values around zero; boundary regions (e.g., brain-skin) have low correlations as well. Data is from mouse 2, session 5. (C) Mask created from the data in B (excluded pixels in black). Note that the mask highlights boundary areas where the brain region meets the reflected skin flaps (pink arrows). (D-F) The same as A-C for mouse 3, session 3. As with other metrics, the data from this mouse is shown to be noisier, but the mask highlights the edge of the dental cemented region (green arrows). In all sessions, pixels are often removed from the region of the confluence of venous sinuses where the cerebrum meets the olfactory bulb with this mask (cyan arrows).
Alternatively, one could choose a pixel as a seed that failed quality metrics and was surrounded by other noisy pixels and was not interpolated using spatial smoothing (these pixels are thus available for analysis only in the manual segmentation data). Such seeds resulted in noisy functional connectivity maps without a clear neurologic basis for the correlations (Fig. 9). This result held regardless of the particular mask that caused a pixel to fail. Thus, when larger blocks of low quality pixels are present, their effect was noticeable and could not be removed by spatial smoothing. Masking of these pixels prevents these noisy functional connectivity maps from being present in the overall connectivity matrix, R.

Averaging across sessions and mice
Use of the intersection method resulted in substantial drop-out of pixels when averaging across sessions. When using manual segmentation, we found that a median of 1378 pixels (8%; range: 819-2108, 5-12.9%) were lost when using intersect averaging across sessions. When using pixel-wise masking with guided segmentation in combination with the intersect averaging method, more pixels were lost (median: 1563 pixels, 9.5%; range: 1285-2799, 7.8-17.1%). This increased loss occurs because the pixels excluded in each session do not always match between sessions. With censored averaging, no pixels are lost. The resulting improved field-of-view is apparent in the seed-based functional connectivity maps (Fig. 10). In mice with a high degree of overlap between sessions, all methods were able to do a reasonable job of preserving a large field-of-view (Fig. 10(A-D)). However, in mice wherein field-of-view varied more substantially between sessions, the censored method resulted in a larger preserved field-of-view (compare Fig. 10(E) to Fig. 10(G) and Fig. 10(I) to Fig. 10(K)). When utilizing pixel-wise quality metrics in combination with the intersection method, the problem is even worse, as every pixel excluded in each single session necessarily reduces the overall field-of-view (Fig. 10(F, J)).

Fig. 5. Variation in the masks created by the pixel-wise quality metric across multiple runs in the same mouse (mouse 1, sessions 1-5, prior to affine transformation). In this mouse, camera saturation was never present, so that mask is not shown. Note also that the mouse was repositioned after Run 1, so that run is slightly shifted relative to the others. For the SNR mask (upper row), an area at the posterior edge of the left visual cortex (red arrow) is masked in most runs, possibly due to pooling mineral oil (used in this mouse) at that location. Additionally, scattered pixels in the center of the field-of-view are excluded in each run. For the local correlation mask (lower row), the masked areas are very similar across runs. Common areas excluded were the areas of fur in the upper right and upper left corners, the venous sagittal sinus (blue arrows), and along the brain-skin interface (green arrows).

Fig. 6. Demonstration of unintended intra-rater variation in brain segmentation. Data is shown prior to affine transformation. (A) Example false color image from a single imaging frame from the OIS system as used for brain segmentation (mouse 2, session 5). (B) Variation in manual segmentation between two sessions by the same reader. Pixels within the first segmentation are colored red while those in the second segmentation are colored green. The overlap between the two segmentations is shown in yellow. (C) Reduced variation (greater overlap) is seen between two segmentations when using guided segmentation.

Fig. 8. The effects of the pixel-wise quality metrics on the results of seed-based functional connectivity analysis in single sessions. Data is shown from mouse 1, session 1 after affine transformation. First a seed was chosen from the left motor cortex (black circle) that had passed all quality metrics. Functional connectivity maps are shown using both manual (A) and guided segmentation (B). The two maps are similar; thus the presence of low quality pixels at distant locations has little effect on the overall map. Then, a nearby pixel also in the left motor cortex was chosen as a seed (black circle). This seed failed the local correlation mask, but was filled in by interpolation during spatial smoothing and thus is present in both segmentations. Maps from manual segmentation (C) are similar to that from guided segmentation (D). Thus, in this basic analysis, the Gaussian spatial smoothing is able to ameliorate the effects of isolated low quality pixels even without masking.
When combining data from all mice, the improvements with censored averaging are even more apparent (Fig. 11). With manual segmentation and intersect averaging, 3.00 × 10^7 correlation coefficients remain (only 11.1% of the 128^4 = 2.68 × 10^8 possible correlations). With guided segmentation and intersect averaging, only 1.57 × 10^7 correlation coefficients remain (5.9% of all possible correlations). With guided segmentation and censored averaging, 1.57 × 10^8 correlation coefficients remain; this represents a 5.2-fold improvement over the prior standard method and 58% of all possible correlations in the field-of-view. This improvement is slightly exaggerated by the inclusion of suboptimal scans: the three runs with inadequate dental cement restrict the lateral extent of the intersect method field-of-view, and the one run with a large region of camera saturation results in a large hole in the left parietal cortex of the intersect method field-of-view. An alternative method to preserve field-of-view would be to simply not use data from mice with poor field-of-view or poor signal-to-noise, but this approach can exclude valuable data. Widely applicable analysis techniques should not rely on having only perfect data to work. Furthermore, the data from individual mice (e.g., Fig. 10) demonstrate that even mice with limited field-of-view still show the expected network structure, and their inclusion in the overall average should improve confidence and statistical rigor for data within the imaged regions. Using all the data, even the suboptimal scans, is a more efficient use of mice and scanning time.

Fig. 9. Functional connectivity maps performed using seeds that failed pixel-wise quality metrics. All data is after affine transformation. All seeds (black circles) shown in this figure were surrounded by other pixels that failed the quality metrics and were unable to be interpolated using spatial smoothing; thus all maps are shown using manually segmented data only. (A) A map of correlation coefficients using a seed in the frontal cortex along the sagittal sinus (mouse 1, session 5). This pixel failed the local correlation metric. (B) A seed pixel chosen from the region where the camera was saturated in the left retrosplenial cortex (mouse 3, session 3). (C) A seed pixel chosen from a region of low SNR in the left somatosensory cortex (mouse 3, session 3). (D) A seed pixel chosen from a region of low local correlation in the cingulate cortex and sagittal sinus (mouse 3, session 3). The resulting images all consist of patterns without a sensible neurologic network.
The wider field-of-view provided by censored averaging permits mapping of brain regions that are more lateral, anterior, and posterior. Although these areas were not present in every brain exposure, they are readily mapped by the censored method. With the intersect methods, some areas such as olfactory and visual cortex are lost. Similarly, the full extent of some networks, such as the lateral somatomotor network, is seen only with the censored method. Throughout most of the field-of-view, the correlation map using the censored data is smooth, and it is not readily apparent that a different set of sessions contributes at each point. However, along the edges of the censored field-of-view, only a small number of sessions contribute to the overall image. In these areas, the edges between different component fields-of-view are more obvious.
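The difference between the two merging strategies compared above can be illustrated with a minimal Python/NumPy sketch. This is not the authors' implementation: the function names, the use of NaN to mark masked pixels, the averaging in Fisher z space, and the toy four-pixel data are all assumptions made for illustration.

```python
import numpy as np

def fisher_z(r):
    """Fisher z-transform; clip so that |r| = 1 (the diagonal) stays finite."""
    return np.arctanh(np.clip(r, -0.999999, 0.999999))

def session_fc(ts, mask):
    """Pixel-by-pixel Fisher z matrix for one session (hypothetical helper).

    ts:   (n_pixels, n_frames) time courses
    mask: (n_pixels,) bool, True where the pixel passed segmentation and
          quality checks; rows/columns of masked pixels are left as NaN.
    """
    n = ts.shape[0]
    z = np.full((n, n), np.nan)
    good = np.flatnonzero(mask)
    z[np.ix_(good, good)] = fisher_z(np.corrcoef(ts[good]))
    return z

def intersect_average(zs, masks):
    """Keep only pixel pairs in which both pixels are valid in every session."""
    common = np.logical_and.reduce(masks)
    valid = np.outer(common, common)
    mean_z = np.mean(zs, axis=0)      # NaN wherever any session is NaN
    mean_z[~valid] = np.nan
    return np.tanh(mean_z), valid.astype(int) * len(zs)

def censored_average(zs):
    """Average each pixel pair over the sessions in which it is available."""
    zs = np.asarray(zs)
    count = np.sum(~np.isnan(zs), axis=0)   # sessions contributing per pair
    total = np.nansum(zs, axis=0)           # NaN entries count as zero
    mean_z = np.full(count.shape, np.nan)
    np.divide(total, count, out=mean_z, where=count > 0)
    return np.tanh(mean_z), count

# Toy data: 4 pixels, 3 sessions, each session with a different mask.
rng = np.random.default_rng(0)
n_pixels, n_frames = 4, 200
masks = [np.array([True, True, True, False]),
         np.array([True, True, False, True]),
         np.array([True, True, True, True])]
zs = [session_fc(rng.standard_normal((n_pixels, n_frames)), m) for m in masks]

r_int, n_int = intersect_average(zs, masks)
r_cen, n_cen = censored_average(zs)
kept_intersect = int(np.sum(~np.isnan(r_int)))
kept_censored = int(np.sum(~np.isnan(r_cen)))
print(kept_intersect, kept_censored)
print(n_cen)
```

In this toy case the intersect method retains 4 of the 16 pixel pairs, whereas the censored method retains all 16, with the per-pair session count `n_cen` ranging from 1 to 3. The count array plays the same role as the "number of sessions contributing" panels in Figs. 10 and 11.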

Discussion
We have developed, demonstrated, and quantified novel visual processing methods to improve the analysis of mouse optical neuroimaging data. Pixel-wise quality metrics improved the data quality of the hemodynamic time courses and increased image quality through improved image segmentation. These improvements were quantified as decreased unintended intra- and inter-rater variation in brain segmentation. The new methods lead to improved confidence in the segmented boundaries. Additionally, since the guided segmentation is less user-dependent, the segmentation process is less dependent on expertise. Further, we developed methods to combine functional connectivity correlation coefficients even when the pixels involved differ between sessions. These methods compensate for both the varying field-of-view in each cranial exposure and the varying brain masks introduced by the pixel-wise quality metrics.

Fig. 10 (continued). ... showing the number of sessions contributing data for each pixel's correlation coefficient when censored averaging is used (here, most pixels use all three sessions). (E-H) Similar data to A-D using data from mouse 3 and a seed in the right motor cortex. Note two large regions that were excluded by the pixel-wise quality metrics from one session cause drop-out in the guided segmentation / intersect method data. In the left parietal cortex, there is a region with signal loss due to camera saturation (green arrows, compare to Fig. 2(D)). In the left somatosensory cortex, there is signal loss due to instability in camera illumination (blue arrows, compare to Fig. 3(H)). The censored averaging method preserves the full field-of-view. (I-L) Similar data to A-D using data from mouse 3 and a seed in the right retrosplenial cortex.
The techniques presented enable a more robust statistical treatment of imaging data and finer control over data quality. One goal of this contribution is to suggest analysis methods that preserve as much usable data as possible. Ultimately, this gain could decrease the number of mice or sessions required and could increase statistical power. Rather than discarding mice with limited fields-of-view, or discarding sessions with areas of poor data quality, these data can be incorporated into the statistical model. This process is similar in concept to the fMRI technique of removing individual frames due to motion [24,25], which enables usable data to be extracted from sessions otherwise corrupted by subject motion. Although, on an individual session basis, the exclusion of pixels with low signal-to-noise did not visually appear to change many canonical correlation maps, more advanced metrics such as community construction and network analysis can be highly sensitive to data quality and noise. Thus, we expect that further advantages of pixel masking will become apparent in the future.
The quality metrics rely on multiple thresholds to determine which pixels to include. The coefficients chosen here were selected based on visual inspection of the data and the masks generated. Future work is necessary to determine if the values chosen here are optimal and generalizable. For example, different wavelengths of light have different path lengths in tissue, which may affect the masking parameter thresholds (e.g., λ1, λ2). Additionally, other optical intrinsic signal imaging systems may have different signal-to-noise characteristics. We would, however, expect that the approaches described here should be generalizable.
Additional insight may be gained by examining the Fourier properties of the data. For example, techniques developed for fMRI such as amplitude of low frequency fluctuations (ALFF) [26] and relative ALFF (rALFF) [27] may be able to differentiate brain from other tissue types. However, the magnitudes of these metrics are not homogeneous across the brain [28,29], and their utility for segmentation is unexplored. Furthermore, because optical imaging's high sampling rate avoids the aliasing of high-frequency systemic physiology into the low frequency functional connectivity band, the values of these metrics may be different with optical imaging. This difference from fMRI may be particularly relevant for rALFF, which depends on the reference frequency band chosen for normalization. Understanding the properties of these metrics with OIS imaging will be part of our future work.
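As a rough illustration of why the reference band matters for a relative metric, the following Python/NumPy sketch computes ALFF-like and rALFF-like values for a single time course. The function name, the 0.01-0.08 Hz low-frequency band, the 10 Hz sampling rate, and the synthetic two-sinusoid signal are assumptions chosen for the example and are not taken from this study or from the cited definitions.

```python
import numpy as np

def alff_metrics(x, fs, low_band=(0.01, 0.08), ref_band=None):
    """ALFF-like and relative-ALFF-like values for one time course.

    x:        1-D signal
    fs:       sampling rate in Hz
    low_band: (f_lo, f_hi) band treated as "low frequency" (assumed limits)
    ref_band: band used for normalization; None means all frequencies > 0
    """
    x = np.asarray(x, dtype=float)
    x = x - x.mean()                              # remove the DC offset
    freqs = np.fft.rfftfreq(x.size, d=1.0 / fs)
    amp = np.abs(np.fft.rfft(x)) / x.size         # amplitude spectrum
    in_low = (freqs >= low_band[0]) & (freqs <= low_band[1])
    if ref_band is None:
        in_ref = freqs > 0
    else:
        in_ref = (freqs >= ref_band[0]) & (freqs <= ref_band[1])
    alff = amp[in_low].mean()                     # mean amplitude in low band
    ralff = amp[in_low].sum() / amp[in_ref].sum() # low band relative to reference
    return alff, ralff

# Synthetic signal: a slow 0.05 Hz component plus a 1 Hz component standing
# in for faster systemic physiology that a high frame rate can resolve.
fs = 10.0
t = np.arange(2000) / fs
x = np.sin(2 * np.pi * 0.05 * t) + np.sin(2 * np.pi * 1.0 * t)

_, ralff_full = alff_metrics(x, fs)                          # reference: all f > 0
_, ralff_narrow = alff_metrics(x, fs, ref_band=(0.01, 0.5))  # reference excludes 1 Hz
print(ralff_full, ralff_narrow)
```

With the full spectrum as the reference, the relative value is close to 0.5 because the 1 Hz component is counted in the denominator; restricting the reference to 0.01-0.5 Hz raises it to nearly 1. The same signal therefore yields very different relative values depending on how much of the resolved high-frequency content is included in the normalization.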
Other sources of noise in optical imaging data may require different types of filtering procedures. For example, as awake optical neuroimaging in mice becomes more prevalent [19,30], imaging quality will likely be improved by methods to remove artifacts due to motion [25]. Although mice are restrained in these studies, motion can cause systematic errors in fMRI measurements of functional connectivity even in restrained, anesthetized subjects [31]. Additionally, the effects of motion can differ between populations leading to statistical bias when performing group-level analyses [32]. Censoring individual pixels will likely be inadequate to remove such artifacts and more sophisticated methods, as have been used in fMRI [33], will be required.
Although the methods demonstrated in this manuscript concerned optical intrinsic signal imaging and resting-state functional connectivity, they should be broadly applicable to other imaging modalities. Human optical neuroimaging techniques, such as diffuse optical tomography (DOT), face spatial data issues similar to those that arise in mice. Variation in DOT image-pad positioning can introduce problems such as the field-of-view not being constant subject-to-subject and session-to-session. When functional data from DOT is mapped onto anatomic imaging from MRI [34,35], variation can be quantified, but this process does not ameliorate the underlying problem that variability imposes. As in prior mouse studies, the intersect method for multi-session and multi-subject averaging is commonly used [35,36], but it suffers from the same problems that were quantified in this paper. Such methods will be severely limiting for future studies, especially those in children or hospitalized patients, for example, where one might expect probe positioning to be extremely variable. Accounting for variations in a rigorous way is important to improving data reliability. The same general approach described here could be applicable to atlases for functional DOT data.

Fig. 11. Demonstration of methods for combining data across mice. Data is shown after affine transformation. Each row is a different canonical resting-state network as shown by seed-based functional connectivity with the seed denoted by the circle (black or red). The first column uses manually segmented data merged using the intersect method. The second column is data segmented using the guided segmentation and then merged with the intersect method. The third column is data segmented using the guided segmentation and then merged using the censored method. The fourth column shows the number of sessions contributing data for each pixel's correlation coefficient when calculated using the censored method. When seeds are selected that are outside of the field-of-view of the intersect method data, then the segmented brain is shown entirely in blue.
Future work will address the statistical analysis of merged data, which is nontrivial. Correlation coefficients in the final data will have been computed from a different subset of sessions, and we will need to treat the variance in an individualized manner. Then, the Fisher coefficients can be properly converted to z-scores and p-values. An alternative method for the analysis of resting-state functional connectivity data is independent component analysis (ICA). Similar censoring methods for ICA would be ideal, although the adaptation of ICA algorithms to comprehend partial data may be difficult. Similarly, the general concept of creating a mask to guide data censoring should be applicable to task-evoked activity (from OIS imaging, fluorescent dye imaging, or DOT). General linear model coefficients could be calculated at each pixel using a censored hemodynamic trace and expected canonical response. However, again the analysis of error and degrees of freedom necessary to calculate p-values will need to take into consideration the censored nature of the data.
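One simple way to make the per-pair variance explicit, offered here only as an illustrative assumption and not as the analysis the authors intend, is a one-sample test across the sessions that contribute to each pixel pair. The symbols S_ij (the set of sessions in which both pixels i and j are valid), n_ij (its size), s_ij, t_ij, and ν_ij are introduced for this sketch, and sessions are treated as independent samples.

```latex
\bar{z}_{ij} = \frac{1}{n_{ij}} \sum_{s \in S_{ij}} z_{ij}^{(s)}, \qquad
s_{ij}^{2} = \frac{1}{n_{ij}-1} \sum_{s \in S_{ij}} \left( z_{ij}^{(s)} - \bar{z}_{ij} \right)^{2}, \qquad
t_{ij} = \frac{\bar{z}_{ij}}{s_{ij} / \sqrt{n_{ij}}}, \qquad
\nu_{ij} = n_{ij} - 1 .
```

Because n_ij varies across the field-of-view, pairs near the edge of the censored field-of-view, where few sessions contribute, would have few degrees of freedom, and pairs with n_ij = 1 could not be tested at all under this scheme.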

Conclusion
In conclusion, these rigorous in vivo studies demonstrate that pixel-wise quality metrics, guided segmentation, and censored averaging techniques can increase the amount of usable data and field-of-view obtainable from optical intrinsic signal imaging studies. More sessions can be included, thereby increasing subject numbers and likely increasing statistical power. Moreover, these methods increase the statistical rigor of optical neuroimaging and should enable detection of subtle effects. Ultimately, the resulting better understanding of data quality and its statistical properties will facilitate future work with more advanced methods of interpreting brain network connectivity data.

Funding
National Heart, Lung, and Blood Institute (T32-HL007915); National Institute of Neurological Disorders and Stroke (R01-NS060653); National Institute of Biomedical Imaging and Bioengineering (P41-EB015893); Eunice Kennedy Shriver National Institute of Child Health and Human Development (R37-HD059288); Children's Hospital of Philadelphia.