Mapping Social Behavior-Induced Brain Activation at Cellular Resolution in the Mouse

Understanding how brain activation mediates behaviors is a central goal of systems neuroscience. Here, we apply an automated method for mapping brain activation in the mouse in order to probe how sex-specific social behaviors are represented in the male brain. Our method uses the immediate-early-gene c-fos, a marker of neuronal activation, visualized by serial two-photon tomography: the c-fos-GFP+ neurons are computationally detected, their distribution is registered to a reference brain and a brain atlas, and their numbers are analyzed by statistical tests. Our results reveal distinct and shared female and male interaction-evoked patterns of male brain activation representing sex discrimination and social recognition. We also identify brain regions whose degree of activity correlates to specific features of social behaviors and estimate the total numbers and the densities of activated neurons per brain areas. Our study opens the door to automated screening of behavior-evoked brain activation in the mouse.

and D4 point to regions of high autofluorescence, which cause high rates of false positive detection by other tested methods, but not by CNs. Arrows in d2 indicate an example of dim cells that were not detected by CNs. Arrows in d3 show an example of two neighboring cells that were detected by CNs and further separated by the cell separation algorithm. Scale bar in d and d1 = 1 mm and 100 μm, respectively. (E-F) Evaluation of CN and human performance depending on the background autofluorescence. Y axis shows precision, recall, and F score from 10 FOV tiles from the ground truth dataset; X axis shows autofluorescence brightness of the brain regions in the particular tiles. The CN (E) and human (F) performance was overall independent of the background, with the exception of the “darkest tile”, which had a lower CN recall of 0.63 (F score 0.76, and precision 0.94). This tile included the caudal olfactory tubercle area, which comprises myelinated fibers passing from the dorsal striatum. This suggests that increased light scattering in areas with myelinated tracks may somewhat lower CN recall performance.


INTRODUCTION
Central to the understanding of brain functions is insight into the distribution of neuronal activity that drives behavior. Local measurements of brain activity in behaving mice can be made with electrodes and fluorescent calcium indicators (Buzsáki, 2004; Grewe and Helmchen, 2009), but such approaches provide information regarding only a very small fraction of the ~70 million neurons that comprise the mouse brain. The detection of elevated levels of the immediate-early genes (IEGs) linked to recent neuronal activity (Clayton, 2000; Guzowski et al., 2005) is a more spatially comprehensive technique. While it lacks the time resolution of electrophysiological recordings or calcium imaging, it does have the potential of providing a complete view of recent whole-brain activity. Once determined, the whole-brain IEG-based map can be used to generate structure-function hypotheses to be probed by high-resolution recordings as well as optogenetic and chemogenetic methods (Fenno et al., 2011; Lee et al., 2014).
Here, we use a pipeline of computational methods that permits automated unbiased mapping of c-fos induction in mouse brains at single-cell resolution, in a similar way as recently described for mapping the induction of the IEG Arc (Vousden et al., 2014). Specifically, we use serial two-photon (STP) tomography to image the expression of c-fos-GFP, a transgenic c-fos green fluorescent protein reporter, across the entire mouse brain. The activated c-fos-GFP+ cells are computationally detected, their location is mapped at stereotaxic coordinates within a reference brain, and their numbers and densities per anatomical brain areas are determined within the Allen Mouse Brain Atlas. Finally, region of interest (ROI)-based and voxel-based statistical tests are applied to identify brain areas with behaviorally evoked c-fos-GFP activation.
To demonstrate the application of the computational pipeline to the mapping of behavior-evoked brain activation, we focus on mouse social behavior and generate activation maps representing sex-specific social behaviors in the male brain. Rodent social behavior is an area of intense research, and c-fos mapping, lesion studies, and other functional approaches have been used to identify brain regions that are activated and contribute to male and female sexual behaviors as well as male-male aggressive behaviors (Anderson, 2012; Biały and Kaczmarek, 1996; Brennan and Zufall, 2006; Coolen et al., 1996; Pfaus and Heeb, 1997; Veening et al., 2005; Yang and Shah, 2014). Much less is known, on the other hand, about the brain areas activated during the initial period of sex discrimination and social recognition before the manifestation of the correct behavioral response.
Here, we explore the question of sex discrimination and social recognition by limiting the male-female and male-male interactions to a brief 90 s period, during which the behavioral repertoire comprises only social exploratory activity, such as anogenital sniffing, close following, and nose-to-nose sniffing, without mating or aggression. A side-by-side comparison of the female and male interaction-evoked whole-brain activation revealed (1) a broad activation of areas downstream of both the main and accessory olfactory bulb (MOB and AOB) in the male-female interaction and a bias toward structures downstream of the MOB in the male-male interaction; (2) activation of structures related to behavioral motivation during the male-female, but not male-male, interaction; and (3) sex-specific as well as shared hypothalamic activation. Taking advantage of the cellular resolution of the whole-brain data, we then identified brain regions whose level of activation was correlated to specific features of the social behaviors, including regions linked to anogenital sniffing that lie downstream of the pheromone-activated AOB and regions linked to close following that belong to the striatopallidothalamocortical circuitry. Finally, we calculated the total numbers and the densities of c-fos-GFP+ cells per activated brain region of the female- and male-specific brain data sets, providing a quantitative estimate of whole-brain activation evoked by social behaviors.

Whole-Brain Detection of c-fos-GFP+ Cells in STP Tomography Data Sets
We have established an automated and quantitative whole-brain method for mapping behaviorally evoked c-fos induction in transgenic reporter mice expressing c-fos-GFP from a recombinant c-fos promoter (Experimental Procedures). This necessitated the development and optimization of (1) computational detection of c-fos-GFP+ cells in the mouse brain imaged by STP tomography, (2) 3D registration of the STP data sets to a reference mouse brain, and (3) statistical analyses of the whole-brain distribution of the c-fos-GFP+ cells (Figure 1A).

STP Tomography and Computational Detection of c-fos-GFP+ Cells
(A) Imaging and data processing pipeline for mapping whole-brain activation in c-fos-GFP mice.
(B) A sample 280-serial section data set of a c-fos-GFP mouse brain imaged by STP tomography. (C-H) Registration of CN-detected c-fos-GFP+ cells in the RSTP brain. (C) A coronal section shows the autofluorescence signal, which is used for registering the 3D reconstructed sample brain (D) onto the RSTP brain (E). (F) A total of 2,177 c-fos-GFP+ cells were detected in the same coronal section; scale bar, 1 mm. (G) A total of 360,183 c-fos-GFP+ cells were detected in the whole brain, reconstructed in 3D, and (H) registered onto the RSTP brain using the image registration parameters established in the (D) and (E) step.
The mouse brains were imaged by STP tomography as data sets of 280 serial coronal sections, with x-y resolution 1.0 μm and z-spacing 50 μm, which required an imaging time of ~21 hr per brain (Figure 1B). To achieve a reliable computational detection of the c-fos-GFP+ cells throughout the whole brain, we used convolutional networks (CNs) that can learn to recognize image features in complex data sets (V. Jain et al., 2007, IEEE, conference; Turaga et al., 2010) (Figure S1; Experimental Procedures). Since nearby c-fos-GFP+ cells were sometimes merged in the CN output, a postprocessing step was devised that could separate such "touching" cells (Figure S1). The CN performance was then quantified on a new set of marked-up fields of view from a second c-fos-GFP brain using the F-score measure, which represents the harmonic mean of the precision and recall (i.e., the false positive and false negative error rate), where F-score 1 is the best and 0 the worst. The CN performance reached F-score 0.88 (precision 0.86, recall 0.90), which was comparable to human interuser variability represented by F-score 0.90 (precision 0.90, recall 0.90) (Figure S1; Experimental Procedures). We conclude that the trained CN provides an automated and highly accurate method for detection of c-fos-GFP+ cells in whole mouse brains imaged by STP tomography.
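The F-score values quoted above follow directly from the reported precision and recall. The short sketch below recomputes them; the function names are illustrative and are not taken from the authors' pipeline.

```python
from typing import Tuple


def f_score(precision: float, recall: float) -> float:
    """Harmonic mean of precision and recall."""
    if precision + recall == 0:
        return 0.0
    return 2 * precision * recall / (precision + recall)


def precision_recall(true_pos: int, false_pos: int, false_neg: int) -> Tuple[float, float]:
    """Precision and recall from detections scored against ground-truth markup."""
    detected = true_pos + false_pos
    actual = true_pos + false_neg
    precision = true_pos / detected if detected else 0.0
    recall = true_pos / actual if actual else 0.0
    return precision, recall


# Precision and recall values reported in the text.
cn_f = f_score(0.86, 0.90)      # convolutional network
human_f = f_score(0.90, 0.90)   # human interuser comparison
print(f"CN F-score: {cn_f:.2f}")
print(f"Human F-score: {human_f:.2f}")
```

Because the harmonic mean is pulled toward the smaller of its two inputs, a detector cannot reach a high F-score by trading many false positives for few false negatives, or the reverse.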
Anatomical Registration of the Whole-Brain c-fos-GFP Data
The CN-based cell counting yields the number of c-fos-GFP+ cells in each 280-section data set, with each cell having an xyz location. To be able to compare patterns of c-fos activation between experimental groups in one common brain volume, we created a Reference STP (RSTP) brain coregistered to the digital Allen Brain Atlas (ABA) for 8-week-old C57BL/6 mice (Sunkin et al., 2013) (Figure S2A; Experimental Procedures; Movie S1). The image registrations were done by a 3D affine transformation, followed by a 3D B-spline transformation with Mattes mutual information as the similarity measure (Mattes et al., 2003). The 3D registration accuracy was calculated to be 65.0 ± 39.9 μm (mean ± SD; Figures S2B and S2C; Experimental Procedures), which is also the accuracy for the registration of all STP experimental data sets to the RSTP brain for data analysis. The alignment of the RSTP and ABA Nissl brains was further improved by 2D affine and B-spline transformations using an STP tomography-imaged CAG-Keima brain, which has a Nissl-like fluorescent labeling from the broadly expressing CAG (cytomegalovirus-IE/chicken β-actin) promoter (Figure S3A; Experimental Procedures). Finally, the alignment of many ABA anatomical labels was validated, and in some cases manually corrected, based on a comparison to brain structures delineated by tissue autofluorescence or fluorescent protein expression in parvalbumin-, glutamic acid decarboxylase-, and somatostatin-specific transgenic reporters (Figures S3B-S3D).
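To illustrate why a mutual-information similarity measure suits this registration, the sketch below estimates mutual information from a plain joint histogram. This is a simplification: it is not the Parzen-window formulation of Mattes et al. and not the registration software used in the study, and the function name, bin count, and synthetic intensities are assumptions made for illustration.

```python
import math
import random
from collections import Counter


def mutual_information(fixed, moving, bins=8, max_value=256):
    """Joint-histogram mutual information (in nats) of two equally long intensity lists."""
    if len(fixed) != len(moving):
        raise ValueError("images must have the same number of pixels")
    width = max_value / bins
    # Quantize every intensity into one of `bins` histogram bins.
    fixed_bins = [min(int(v / width), bins - 1) for v in fixed]
    moving_bins = [min(int(v / width), bins - 1) for v in moving]
    n = len(fixed_bins)
    joint = Counter(zip(fixed_bins, moving_bins))
    marginal_fixed = Counter(fixed_bins)
    marginal_moving = Counter(moving_bins)
    mi = 0.0
    for (i, j), count in joint.items():
        p_ij = count / n
        p_i = marginal_fixed[i] / n
        p_j = marginal_moving[j] / n
        mi += p_ij * math.log(p_ij / (p_i * p_j))
    return mi


# Synthetic "images": the aligned copy has inverted contrast, the other is scrambled.
rng = random.Random(0)
fixed_image = [rng.randrange(256) for _ in range(5000)]
aligned_inverted = [255 - v for v in fixed_image]
scrambled = list(fixed_image)
rng.shuffle(scrambled)

mi_aligned = mutual_information(fixed_image, aligned_inverted)
mi_scrambled = mutual_information(fixed_image, scrambled)
print(f"aligned: {mi_aligned:.3f} nats, scrambled: {mi_scrambled:.3f} nats")
```

The aligned pair scores high even though its contrast is inverted, which is the property that makes mutual information useful when the two images come from different signals, such as autofluorescence and a Nissl-like label.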
Calculation of the Sample Size for c-fos-GFP-Based Mapping of Mouse Brain Activation
The RSTP brain allows us to calculate the number of c-fos-GFP+ cells per anatomical ABA region in the 280-section data sets. To estimate the required sample size for statistical comparisons, we used power analysis on data from a baseline group of mice (Experimental Procedures). The brains of 7 c-fos-GFP mice (no experimental manipulation) were imaged by STP tomography and warped to the RSTP brain, and the c-fos-GFP+ cells were counted per anatomical ROI. To determine the optimal sample size, Monte Carlo methods were applied to these data to simulate ROI counts for two groups at various effect sizes. As shown in Figure 2, the number of sufficiently powered ROIs (α < 0.05, power > 0.80) increased at an approximately constant rate with the sample size (N) until N = 10, where it started to plateau. We chose N = 12-13 as the sample size per group, which assures high statistical power for most ROIs.
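The logic of a Monte Carlo power estimate can be sketched for a single ROI as follows. This is a simplified stand-in for the analysis described above, under stated assumptions: the counts are modeled as unit-variance normal draws rather than resampled from the baseline data, a two-sided z approximation replaces the actual test statistic, and all names and defaults are hypothetical.

```python
import math
import random
import statistics


def simulated_power(effect_size, n_per_group, n_sim=2000, alpha=0.05, seed=0):
    """Fraction of simulated two-group experiments that reach significance.

    Assumption: standardized normal data and a z approximation to the
    two-sample test; the source does not specify these details here.
    """
    rng = random.Random(seed)
    z_crit = statistics.NormalDist().inv_cdf(1 - alpha / 2)
    hits = 0
    for _ in range(n_sim):
        control = [rng.gauss(0.0, 1.0) for _ in range(n_per_group)]
        treated = [rng.gauss(effect_size, 1.0) for _ in range(n_per_group)]
        diff = statistics.fmean(treated) - statistics.fmean(control)
        std_err = math.sqrt(
            statistics.variance(control) / n_per_group
            + statistics.variance(treated) / n_per_group
        )
        if abs(diff / std_err) > z_crit:
            hits += 1
    return hits / n_sim


# Power for one hypothetical ROI at effect size 1.0 as the group size grows.
for n in (4, 7, 10, 13):
    print(n, simulated_power(1.0, n))
```

Repeating this estimate for every ROI and counting how many exceed a power of 0.80 at each group size gives the kind of curve the text describes, rising with N and then flattening.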
The Selection of the Social Behavioral Protocols and Characterization of c-fos-GFP Induction
Interactions between a male and a female mouse, and between a male and a male mouse, include initial common social behaviors, such as anogenital sniffing and close following, and consequent sex-specific behaviors, such as mounting and fighting. In the current study, we wished to focus on the comparison of brain activation evoked during the initial social exploration-based phase of the male-female and male-male interactions, during which the male is expected to recognize the social stimulus and to discriminate the sex of the interacting partner.
The social comparison was based on two experimental groups. In the male-female interaction group, an ovariectomized (OVX) conspecific female was introduced for 90 s in the home cage of a naive c-fos-GFP+ male, while in the male-male interaction group, a conspecific male was used as the 90 s stimulus (Movie S2). As described before in studies of social recognition, the brief interaction period included exploratory behavioral activities of anogenital sniffing, close following, and nose-to-nose sniffing, but no sexual behavior or aggression (Figure S4). The OVX female, which was recognized by the male as a social stimulus comparable to an intact female (Figure S4), was chosen to limit experimental variability due to the estrous cycle (Winslow, 2003).
(B and C) Examples of ABA ROI segmentation (B) and the corresponding c-fos-GFP+ cell counts (C): hippocampus: dark blue; 33,508 cells; medial amygdalar nucleus: light blue; 3,035 cells; nucleus accumbens: green; 13,627 cells; and infralimbic cortical area: red; 4,665 cells. (D) Further segmentation of the infralimbic region by cortical layers; top shows the layer ROIs, from layer 1 (orange) to layer 6 (purple), and bottom shows the c-fos-GFP+ cell counts (ILA1 = 223, ILA2 = 243, ILA2/3 = 1,572, ILA5 = 1,731, ILA6 = 896 c-fos-GFP+ cells). The spacing between the layers was enlarged for better visualization. (E and F) Estimation of the sample size based on power analysis of c-fos-GFP+ cell counts. (E) The simulation of the relationship between the number of sufficiently powered ROIs and the sample size shows a steep increase until ~N = 10, which then begins to plateau. For the current study, we chose a sample size of N = 13 (dashed line). (F) The plot of the relationship between the statistical power of each ROI and the effect size for N = 13 group. Of the total 763 ROIs analyzed, 601 (78.8%) showed sufficient statistical power at the effect size 0.6 and 699 (91.6%) at the effect size 1.0.
For control, we included four groups. The baseline group included mice that were not handled or otherwise manipulated; the handling group included mice that were transferred to the experimental area for 90 s; the object group included mice that received a novel object for 90 s; and the olfactory group included mice that were exposed for 90 s to a novel object enriched with a banana-like odor (isoamyl acetate [ISO]; note that ISO is a monomolecular odor and as such is likely to induce simpler activation patterns than complex volatile odors).
In order to characterize the time course of c-fos-GFP induction, we used the 90 s ISO stimulation and tested c-fos-GFP increase in the main olfactory bulb at 0.5, 1.5, 3, and 5 hr poststimulus. This protocol revealed a peak induction at 3 hr after the stimulation, which returned to the baseline level at 5 hr. The time of 3 hr poststimulus was selected for analysis of all behavioral experiments.
In order to compare the c-fos-GFP signal to native c-fos signal, we analyzed female interaction-driven induction in eight selected brain regions by anti-c-fos immunohistochemistry in wild-type C57BL/6 mice and by STP tomography in c-fos-GFP mice (note that the c-fos signal was analyzed at 1 hr poststimulus because of the short half-life of the native c-fos protein). Overall, the c-fos-GFP signal represented 59% ± 6% (mean ± SEM) of anti-c-fos immunosignal, indicating that the direct c-fos-GFP fluorescence detects approximately 60% of all c-fos induced cells (Figures S5D-S5I). Importantly, the female interaction-driven increase was also highly comparable between the wild-type and c-fos-GFP mice (Figures S5D-S5I).

ROI- and Voxel-Based Statistical Analyses
The distribution of the c-fos-GFP+ cells among the different behavioral groups was compared using ROI- and voxel-based statistical tests corrected for multiple comparisons by false discovery rate (FDR) (Experimental Procedures). The 694 ROIs analyzed represent the segmentation of the RSTP brain volume by the ABA anatomical regions, and the c-fos-GFP cell counts are compared ROI-to-ROI between the experimental groups (Experimental Procedures). The RSTP brain voxelization (done by overlapping sphere voxels of 100 μm diameter) generates a discrete digitization that is not biased by anatomical regions, and the c-fos-GFP cell counts are compared voxel-to-voxel (Experimental Procedures). The voxel-based statistics can reveal "hot spot" areas of activation and subregional differences within the anatomical ROIs (Figure S6).
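FDR correction across many ROIs or voxels can be sketched with the Benjamini-Hochberg step-up procedure. The text above states only that FDR correction was applied, so the choice of Benjamini-Hochberg is an assumption here, and the ROI names and p values are invented for illustration.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values (q values) in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so that q values stay monotonic.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        q_values[idx] = running_min
    return q_values


# Hypothetical uncorrected p values for four ROIs (not data from the study).
roi_p = {"ROI_A": 0.001, "ROI_B": 0.02, "ROI_C": 0.04, "ROI_D": 0.30}
q = benjamini_hochberg(list(roi_p.values()))
significant = [name for name, q_value in zip(roi_p, q) if q_value < 0.05]
print(significant)
```

In this toy input, ROI_C has an uncorrected p value below 0.05 but does not survive correction, which is the behavior that keeps the expected share of false discoveries bounded when hundreds of ROIs are tested at once.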
In the first ROI analysis, the comparison of the male-female, male-male, olfactory, and handling groups to the baseline group revealed broad patterns of brain activation, with ~69%, 76%, 79%, and 35% of ROIs activated by the respective manipulations (Table S1). Since all ROIs activated in the handling group were also activated in the other three groups, the handling-induced brain activation represents nonspecific shared stimuli, such as moving the cage to the experimental area. In order to determine the stimulus-specific brain activations, we next compared the male-female, male-male, and olfactory groups to the handling group by both ROI and voxel-based analysis.

Female and Male Interaction-Evoked Brain Activation
It has been proposed that the detection of volatile pheromones by the main olfactory epithelium (MOE) and MOB is necessary for sex discrimination (Baum and Kelliher, 2009) (see Discussion). However, the mechanism of such detection at the level of downstream brain structures is not known. A comparison of the female and male interaction-evoked activation by ROI statistics revealed largely overlapping c-fos-GFP induction among MOB-connected brain regions, including the anterior olfactory nucleus (AON), piriform cortex (PIR), nucleus of the lateral olfactory tract (NLOT), anterior amygdala area (AAA), piriform-amygdala area (PAA), anterior and posterior lateral cortical amygdala (COAa, COApl), and entorhinal cortex lateral (ENTl), in addition to a female-specific activation of taenia tecta (TT) and postpiriform transition area (TR) (Figure 3; Table S2; note that the heat map data in Figures 3, 4, and 5 show statistical significance, while the magnitude of c-fos upregulation is provided in Figure S7). A further analysis by voxel-based statistics revealed a mainly dorsal MOB activation by both stimuli and a clear dorsal-ventral separation between the two stimuli in the PIR and ENT (Figures 3C-3F; it should be noted that overlapping voxel activation between the male and female data sets, seen as yellow areas in Figures 3C-3F, represents activation of the same area, but not necessarily the same neurons). These data suggest that spatial organization of the dorsal MOB outputs leads to activation of distinct neuronal populations in the PIR and ENT, which may contribute to sex discrimination in the male brain.
The sensing of nonvolatile pheromones by the vomeronasal organ (VNO) and AOB has been proposed to play a critical role in mate recognition and behavioral motivation (Baum and Kelliher, 2009). Our analysis of brain regions downstream of the AOB revealed a strong bias toward the female interaction-evoked brain activation, including the AOB granular cell layer (AOBgr), the posterior medial cortical amygdala (COApm), the entire medial amygdala (MEA), bed nucleus of the accessory olfactory tract (BA), and bed nuclei of the stria terminalis (BST) (Figure 4; Table S2; Movie S3). In contrast, male-male interaction induced activation in fewer AOB-linked areas, including the BA and MEA anterior dorsal (ad), anterior ventral (av), and posterior dorsal (pd) (Figure 4; Table S2). Voxel analysis revealed focal activation in the AOB in the male-male interaction (Figure 4C) and a largely overlapping activation in the MEAad, av, and pd in the male-female and male-male data sets (Figures 4D and 4E; Table S2; Movie S3).
The male-female interaction also showed strongly evoked activation of brain areas linked to behavioral motivation, including the olfactory tubercle (OT) and nucleus accumbens shell (ACBsh) of the ventral striatum, prelimbic, infralimbic, and orbital medial (PL, ILA, and ORBm) prefrontal cortical areas, agranular insular cortex (AI), substantia innominata (SI; also known as ventral pallidum), medial dorsal thalamus (MDm), hippocampal ventral subiculum (SUBv), and serotonergic dorsal raphe (DR) (Figure 5A; Table S2). In contrast, the male-male interaction had a comparable induction only in the AI; much weaker activation in the prefrontal cortices, SI, OT, and SUBv; and no significant activation in the MDm and DR (Figure 5B; Table S2). Voxel-based analysis revealed that activation in the medial prefrontal cortices in the male-male interaction was limited to superficial cortical layers (Figure 5C; Movie S3). We also observed a focal activation in the DR at a specific A/P bregma location in the male-female data set (Figure 5E; Movie S3).
The activation of the septal and hypothalamic nuclei is known to mediate both sexual and defensive/aggressive behaviors (Anderson, 2012; Swanson, 2000). We therefore asked whether the brief interaction used in our experiments was sufficient to activate these regions even though it lacked overt mating and fighting. The ROI analysis revealed that the female stimulus induced activation of the rostral lateral septum (LSr) and neuroendocrine nuclei, including the medial preoptic nucleus (MPN), medial preoptic area (MPO), ventral premammillary nucleus (PMv), ventrolateral part of the ventromedial nucleus (VMHvl), paraventricular hypothalamic nucleus (PVH), dorsomedial hypothalamus (DMH), anteroventral periventricular nucleus (AVPV), posterior periventricular hypothalamic nucleus (PVp), and tuberal nucleus (TU) (Figure 6A; Table S2). The male stimulus activated the VMHvl, DMH, PVH, PVp, and TU from the structures of the male-female data set, in addition to a male-specific activation of the dorsomedial part of the ventromedial nucleus (VMHdm), the anterior, preoptic, and intermediate periventricular nuclei (PVa, PVpo, PVi), retrochiasmatic area (RCH), subparaventricular zone (SBPV), supraoptic nucleus (SO), and arcuate nucleus (ARH) (Figure 6B; Figure S7; Table S2). Voxel-based analysis revealed very distinct and focal LSr activation at A/P coordinates between +0.345 and −0.145 (Figure 6C; Movie S3).
The activation in the VMHvl, which was previously shown to play a role in both sexual and aggressive behaviors, was highly overlapping between the male-female and male-male data sets (Figure 6D; Movie S3). In addition, only a medial part of the PMv was activated in the male-male data set, suggesting a functional subdivision within this structure (Figure 6E; Movie S3).
Table S2 for ROI full names. (C-F) Voxel-based analysis revealed activation pattern selective for the female stimulus (red), the male stimulus (green), and shared by both stimuli (yellow). (C) Both male and female stimuli induced dorsal activation in the MOB. (D-F) Dorsoventral separation was detected between the male- and female-evoked activation in the PIR (D and F) and ENT (E and F). See also Movie S3 for the full data set.
Finally, among additional brain areas, the claustrum (CLA), basomedial amygdala (BMA), and intercalated amygdala (IA) were activated by both the female and male interactions; the capsular central amygdala (CEAc), basolateral amygdala (BLA), and thalamic parataenial nucleus (PT) were activated only in response to the female stimulus; and the temporal associational, perirhinal, and ectorhinal (TEa, PERI, and ECT) cortical areas were activated only in response to the male stimulus (Table S2). The activation of the hippocampal CA2 region linked to social memory (Hitti and Siegelbaum, 2014) was also detected in both the male-female and male-male data sets (Table S2).
Table S2 for ROI full names. (C-E) Voxel-based analysis revealed a largely overlapping activation pattern (yellow) in the coactivated AOB (C), MEAad and MEAav (D), and MEApd (E), and selective female-evoked activation in the MEApv and COApm (E). See also Movie S3 for the full data set.

Social Behavior-Specific Brain Activation
In addition to the male versus female comparison described above, we also asked which of the activated brain regions are specific to social behavior, i.e., are shared between the male-female and male-male data sets and are not activated in response to a nonsocial stimulus represented by a novel object enriched with a volatile odor (banana-like ISO).
First, we compared the ISO data set to the handling control. This analysis revealed the expected activation of the PIR and other areas downstream of the MOB, which was similar to the social behavior-evoked activation (Table S2). The activation throughout the rest of the brain, however, was highly divergent from the pattern evoked by the social stimuli, as it included many cortical areas, the entire hippocampus, and the hypothalamic subfornical organ (SFO) regulating autonomic functions (Smith and Ferguson, 2010); the suprachiasmatic nucleus (SCH) regulating sleep, waking, and locomotor activity (Saper et al., 2005); and the arcuate nucleus (ARH) linked to feeding (Sternson, 2013) (Table S2).
Second, we compared the shared male-female and male-male brain activation to the ISO data set. This analysis revealed the subset of areas specific to social behavior, which included the amygdalar regions BA, COApl, MEAav, MEApd, BMAp, BLAv, and PA, the hypothalamic VMHvl and PVH, and the SI (Figure 7; Table S3).

Correlation of c-fos Activation to Time Spent in Social Behaviors
The time spent in a specific behavioral activity may be expected to correlate to the number of c-fos-GFP+ cells in brain regions driving this activity. We next tested whether this correlation may be used to functionally link the activated brain areas in the male-female and male-male data sets to specific features of the social behavior.
Table S2 for ROI full names. (C-E) Voxel analysis showed that (C) the entire ventral part of PL and dorsal half of ILA was activated by the female stimulation, while only the upper layers of the same regions were activated by male stimulation. (D) Ventral striatum (ACB, SI, OT) showed a patch-shaped, strong activation pattern by female stimulus, but not by male stimulus. (E) Voxel analysis pinpointed the maximal activation in the DR by the female stimulus at A/P coordinate −4.78.
The correlation to the time spent in anogenital sniffing identified mainly areas connected to volatile and nonvolatile olfactory signaling, such as the COAa, COApl, COApm, MEA, and BST, and hypothalamic neuroendocrine areas including the MPN, PMv, and VMHvl (Table 1). Correlation to the time spent in close following identified some of the same areas, such as the MEA and BST, but also areas linked to behavioral motivation, including the ACB, OT, SI, ILA, PL, ORBm, MDm, and DR (Table 1). Finally, the correlation to the time spent in nose-to-nose sniffing did not identify any positive association, suggesting that this behavioral feature is not quantitatively linked to any brain regions in our data sets. These data suggest that distinct aspects of the social behavior engage distinct sets of brain areas and that whole-brain cellular c-fos-GFP analysis is able to reveal this structure-function relationship.
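The per-ROI test described above pairs, for each animal, the time spent in a behavior with the cell count in one region. A minimal sketch using the Pearson coefficient is shown below; the text does not say which correlation statistic was used, so Pearson is an assumption, and the numbers are made up for illustration.

```python
import math


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    n = len(x)
    if n != len(y) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    covariance = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return covariance / math.sqrt(var_x * var_y)


# Invented values for five animals (not data from the study).
sniffing_time_s = [5.0, 12.0, 20.0, 31.0, 40.0]
roi_cell_counts = [310, 420, 515, 640, 700]
r = pearson_r(sniffing_time_s, roi_cell_counts)
print(f"r = {r:.3f}")
```

Running one such test per ROI produces hundreds of p values, so in practice the results would need the same multiple-comparison correction as the group comparisons.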
Calculation of the Density of c-fos-GFP+ Cells per ROIs
While the above analyses identified the activated brain areas, the cellular resolution of our data allowed us to also estimate the total numbers and the densities of c-fos-GFP+ cells per anatomical ROIs. Since the z planes in the 280-section data sets are spaced 50 μm apart, we transformed the serial 2D data into 3D whole-brain estimates with a stereological method (Williams and Rakic, 1988) applied to a high-resolution 5,600-section data set with z spacing of 2.5 μm (Experimental Procedures). The obtained 2D-to-3D conversion factor of 2.5 was then used to multiply the 2D ROI counts in order to estimate the total numbers of c-fos-GFP+ cells, and the total counts were divided by the ROI volumes in order to estimate the densities of c-fos-GFP+ cells per activated ROIs (Figure S7).
Table S2 for ROI full names. (C and D) Voxel analysis. (C) A distinct voxel activation was observed in the LSr only by female stimulation. (D) VMHvl showed largely overlapping activation by both stimuli, while VMHdm and ARH showed activation only by the male stimulus. (E) PMv is highly activated by the female stimulus, while the medial part of PMv was also activated by the male stimulus. See also Movie S3 for the full data set.
The average cell densities in the structures significantly activated in the female and male data sets were, respectively, 4,993 ± 400 and 4,519 ± 283 per cubic mm (mean ± SEM), whereas the average density in these structures in the handling control was 3,127 ± 201 per cubic mm (Figure S7). Therefore, the social interactions evoked on average ~1,500 to 2,000 c-fos-GFP+ cells per cubic mm compared to the handling control, suggesting a sparse activation of a few percent of neurons per brain areas (see Discussion).
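The arithmetic behind these estimates is simple enough to sketch. The conversion factor of 2.5 and the mean densities are taken from the text; the function name and the example ROI (1,000 counted cells in 0.5 cubic mm) are hypothetical.

```python
def estimate_density(count_2d, roi_volume_mm3, conversion_factor=2.5):
    """Estimate total cells and cells per cubic mm for one ROI.

    count_2d: cells counted in the 280-section data set for this ROI.
    conversion_factor: the 2D-to-3D factor of 2.5 reported in the text.
    """
    if roi_volume_mm3 <= 0:
        raise ValueError("ROI volume must be positive")
    total_cells = count_2d * conversion_factor
    return total_cells, total_cells / roi_volume_mm3


# Hypothetical ROI, for illustration only.
total_cells, density = estimate_density(1000, 0.5)
print(total_cells, density)

# Mean densities reported in the text (cells per cubic mm).
female, male, handling = 4993, 4519, 3127
evoked_female = female - handling
evoked_male = male - handling
print(evoked_female, evoked_male)
```

The two differences from the handling control are in the same range as the approximate figure of 1,500 to 2,000 evoked cells per cubic mm quoted above.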

DISCUSSION
While the general organization of the brain structures regulating sexual and aggressive behavior is beginning to be understood (Anderson, 2012; Sokolowski and Corbin, 2012), much remains unknown about how information is processed from the sensory periphery (the olfactory system in rodents) to give rise to sex-specific behavioral responses. Here, utilizing a pipeline of computational methods, including ROI-based whole-brain mapping of c-fos activation, voxel-based mapping of subregional differences in c-fos activation, and correlation analysis linking ROI activation to behavior, we compared brief female interaction-evoked activation in the brain of a male mouse to the activation evoked by brief interaction with a male. Some more salient findings from our analyses are discussed below following the method discussion, while the complete ROI- and voxel-based results are provided as a resource in Tables S1, S2, and S3 and Movie S3.
The Method Pipeline for c-fos-GFP-Based Mouse Brain Screening
The entire method pipeline is automated, highly standardized, and operates at a reasonably high throughput: the imaging time per brain is ~21 hr, while the image processing and computational analyses take ~24 hr and occur in parallel with the STP imaging (Vousden et al., 2014).
The first key part of the computational pipeline is the detection of c-fos-GFP+ cells in the STP data sets. We chose to use CNs because these algorithms rely on the learning procedure to account for signal-to-noise ratio variability, and improved performance is achieved by simply increasing the training data set (V. Jain et al., 2007, IEEE, conference; Turaga et al., 2010). The trained CN performance (F-score = 0.88) was in fact close to human expert performance (F-score = 0.90), demonstrating the power of this approach for analysis of fluorescent labeling in STP tomography-imaged mouse brains. We have also tested two other cell detection methods (cell counting in the Volocity image analysis software [Perkin Elmer] and cell counting based on a watershed algorithm [Kopec et al., 2011]), but these were considerably less reliable (F-score < 0.5) compared to the CN-based detection.
The second critical step of the method is the registration of the data sets to the RSTP brain and the Allen Mouse Brain Atlas. Fixation-induced tissue autofluorescence provides rich image content for the registration by the warping algorithm Elastix (Mattes et al., 2003). As a result, we were able to achieve a high level of precision (~60 μm jitter) for the registration of the experimental data sets to the RSTP brain (Figure S2). The alignment of the ABA Nissl-stained sections to the RSTP brain was further helped by the use of the transgenic CAG-Keima brain, with a cellular fluorescent protein labeling that matched the cellular Nissl signal in most brain regions, and by several interneuron-specific reporter mice that helped to validate and improve the matching of the labels to specific brain nuclei (Figure S3). Consequently, the ABA labels became closely aligned to the RSTP brain, as judged based on brain landmarks, such as the corpus callosum, hippocampal pyramidal layers, and many structural borders visible in the autofluorescence signal (Figure S3).
The last part of the method pipeline includes statistical analyses of the brain-wide c-fos-GFP+ cell counts. Since it was first established in rat models of seizure, the inducibility of c-fos has been utilized to map neuronal activation in many behavioral and pharmacological experiments, demonstrating that c-fos can be used as an activity reporter in most if not all areas of the brain (Dragunow and Robertson, 1987; Morgan et al., 1987). The ROI- and voxel-based statistical analyses established here transform the traditional laborious immunostaining or in situ hybridization-based c-fos mapping into an automated whole-brain assay.
In addition to the current application, these methods can also be used to detect and quantify cells in the brains of other fluorescent protein-expressing transgenic mice by simply training a new CN on different ground-truth data. This makes our pipeline easily adaptable to many other applications in quantitative whole-brain mapping, such as the generation of whole-brain cell counts in cell type-specific GFP reporter mice.

Female- and Male-Evoked Maps of Whole-Brain Activation in the Male Brain
By focusing on the initial period of social exploratory behaviors between a naive male and a novel conspecific female or male mouse, we set out to determine the brain activation patterns that underlie social recognition and sex discrimination in the male brain. Our results revealed that while the brief interactions led to an activation of the expected sex-specific response at the hypothalamic level (indicating that the behaviors were sufficient for correct sex discrimination), the upstream patterns of brain activation strongly diverged between the two stimuli.
At the level of the AOB and MOB signaling, the female stimulus evoked activation of all downstream connected brain structures, while the male stimulus showed activation of all MOB-linked structures but only a subset of the AOB-linked structures. The strong MOB-driven brain activation in both behaviors agrees with the role of volatile signaling in sex discrimination proposed by studies using chemical lesion of the MOE (Keller et al., 2006) or genetic disruption of cellular signaling in the MOE (Mandiyan et al., 2005). The finding that the male and female stimuli activate different parts of the PIR and ENT areas suggests that topologically distinct MOB cortical outputs may discriminate the sex-specific stimuli. This dorsoventral separation is an example of a novel spatial organization in the piriform cortex, which until now has been considered to lack gross sensory input-based topology (Ghosh et al., 2011; Sosulski et al., 2011).
The role of the VNO and AOB-driven activation in social behaviors appears to be less clear than that of the MOE/MOB signaling. Lesioning of the VNO failed to affect sex discrimination in male mice (Pankevich et al., 2004), even though it did impair vocalization after nasal contact with female urine (Bean, 1982), while genetic disruption of VNO signaling caused male-male mounting instead of aggressive behavior without affecting male-female behavior (Stowers et al., 2002). Our data point to a more prominent role of the AOB-connected brain structures in the male-female interaction, as MEA, BST, BA, and COApm were all activated in the male-female data set, but only MEA and BA were activated in the male-male data set. The selective BST activation in the male-female data set included the posterior division nuclei (principal, interfascicular, and transverse) proposed to function in reproductive behaviors (Dong and Swanson, 2004), and the magnocellular nucleus of the anterior division proposed to control neuroendocrine functions and pelvic functions, including penile erection (Dong and Swanson, 2006). The female, but not the male, stimulus also evoked activation of brain areas of the striatopallidothalamocortical circuit known to positively regulate behavioral motivation (Ikemoto, 2007; Sesack and Grace, 2010), including the ventral striatum (OT, ACB), ventral pallidum (SI), thalamus (MDm), and prefrontal cortex (ILA, PL, ORB). While we did not detect activation of the dopaminergic neurons of the ventral tegmental area (VTA), which are known to reinforce ACB functions within this circuit during sexual behavior (Ikemoto, 2007; Sesack and Grace, 2010), we did detect activation of the serotonergic DR, which was recently shown to be necessary for ACB functions in social reward (Dölen et al., 2013). The switch between the DR and VTA modulation of ventral striatum may contribute to a transition between exploratory and consummatory male-female behavior.
Legend: Pearson correlation between the time spent in anogenital sniffing and close following and c-fos-GFP cell counts in the regions activated in the male-female and male-male data sets. Significance is based on FDR q value adjusted for multiple comparisons: (+) = 0.01 ≤ FDR q < 0.05; (++) = 0.001 ≤ FDR q < 0.01; and (+++) = FDR q < 0.001.
The analysis of the hypothalamic brain areas revealed activation of structures regulating sexual and aggressive behaviors (Anderson, 2012; Swanson, 2000): the MPN and PMv regulating male reproductive behavior (Simerly, 2002; Yang and Shah, 2014) were selectively activated in the male-female data set, the VMHvl regulating both male sexual behavior and aggression (Anderson, 2012; Lin et al., 2011; Yang et al., 2013) was activated in response to both female and male stimuli, and the VMHdm regulating male defensive behaviors (Sokolowski and Corbin, 2012) was activated only in the male-male data set. Since the brief social interactions did not comprise mating or aggression, these data show that the activation of the hypothalamic nuclei can precede the manifestation of these behaviors as part of the male-female and male-male social exploration-based behaviors.

The Quantification of the Whole-Brain Activation Maps at Cellular Level
The cellular resolution of our data also allowed us to search for correlations between behavioral activity and brain activation and to estimate the density of activated cells per brain area.
The correlation between behavior and c-fos activation can be expected to identify the most behaviorally relevant brain regions in which the number of c-fos activated cells reflects the behavioral performance in individual animals. In agreement with this hypothesis, regions correlated to the time spent in anogenital sniffing included mainly amygdalar and hypothalamic areas of the vomeronasal sensory-motor system transforming the chemosensory information into sexual or aggressive behavior (Swanson, 2000), while the brain areas correlated to the time spent in following included the structures linked to behavioral motivation and described above as part of the striatopallidothalamocortical circuit.
The correlation analysis can be used to add functional significance to activated regions that were not previously known to be involved in social behaviors. For example, the activation of the amygdalar IA and CEAc nuclei was correlated to the anogenital sniffing time, while the PT thalamus activation was correlated to following. Since both IA and CEAc can inhibit the medial central amygdala (CEAm), which is the output fear pathway (Pitkänen et al., 1997), these data suggest that the IA and CEAc are activated by chemosensory cues and may act to modulate fear behaviors during social exploration. The PT, a part of the dorsal group of thalamic nuclei, projects to the ACB (Kelley and Stinus, 1984) and may play a role in motivational modulation of the male-female social behavior.
Finally, the quantification of the numbers of c-fos-GFP+ cells per brain area can provide information about the approximate percentage of neurons behaviorally recruited in the identified brain areas. For example, we observed on average a 2-fold increase (~3,600 c-fos-GFP+ cells per cubic mm) in the prefrontal cortical areas in the male-female data sets, compared to the handling control (Figure S7). Since neuronal density in the mouse cortex is estimated at ~80,000-100,000 per cubic mm (Herculano-Houzel et al., 2006; Keller and Carlson, 1999; Meyer et al., 2010), these data suggest that less than 5% of neurons are recruited in response to the female social stimulus. Further, as most brain areas showed similar c-fos-GFP+ densities, the behavioral recruitment of a few percent of neurons is likely a general feature of c-fos activation. This may represent c-fos induction occurring only in the most strongly activated cells, and such sparse c-fos induction may be relevant for the proposed sparse coding of sensory inputs (Olshausen and Field, 2004).
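The recruitment estimate above is simple arithmetic, which the following minimal sketch makes explicit. It assumes that the ~3,600 cells per cubic mm can be read as the density of activated cells and uses the two bounds of the cited cortical neuron density; the variable and function names are illustrative only.

```python
# Back-of-the-envelope estimate of the fraction of cortical neurons that are c-fos-GFP+.
# Values are the approximate figures quoted in the text; names are illustrative.
activated_density = 3_600                  # c-fos-GFP+ cells per cubic mm
neuron_density_range = (80_000, 100_000)   # cortical neurons per cubic mm


def percent_recruited(activated, total):
    """Percentage of neurons in a unit volume that are c-fos-GFP+."""
    return 100.0 * activated / total


# The lower neuron density gives the upper bound on recruitment, and vice versa.
upper = percent_recruited(activated_density, neuron_density_range[0])
lower = percent_recruited(activated_density, neuron_density_range[1])
print(f"{lower:.1f}% to {upper:.1f}% of neurons")
```

Under these assumptions, both bounds stay below 5%.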

Caveats of the Current Study
There are several caveats associated with our study. First, while the behavior is limited to 90 s, our assay cannot determine whether the observed c-fos-GFP induction occurred entirely during this brief time period or whether some downstream activation occurred during a longer time interval. Second, the behavioral paradigm includes both the introduction and removal of the stimulus animal from the home cage of the c-fos-GFP male, and some of the observed activation pattern may thus reflect stress induced by these manipulations. Third, the use of OVX females in our study restricts the interpretation of the male-female activation data to social exploration that lacks the effects of estrous hormones. Thus, male-female interaction with, for example, estradiol-induced OVX mice may be expected to induce brain activation partially distinct from the one described in the current study. Fourth, since the c-fos-GFP reporter labels ~60% of all c-fos+ cells detected by immunostaining, it is possible that some areas with native c-fos activation were missed in our assay. Fifth, c-fos is a member of a family of IEGs regulated by neuronal activity, and the detection of other IEGs (such as Arc, homer-1A, or zif-268) can be expected to reveal activation maps that only partially overlap with the c-fos-GFP map identified in our paper. Because neuronal activation in some brain areas may induce other IEGs but fail to induce c-fos, the c-fos-GFP-based network of brain areas described here should not be interpreted as a complete brain activation map evoked by social behavior.

CONCLUSIONS
Our method of c-fos-GFP-based screening generates cellular-resolution maps of behaviorally evoked whole-brain activation in the mouse. The patterns of female and male interaction-evoked brain activation revealed clear separation between the two stimuli, including at the level of brain structures downstream of both volatile and nonvolatile chemosensory signaling. These activation patterns were also markedly different from the activation pattern evoked during nonsocial olfactory-enhanced exploratory behavior. These findings demonstrate that our method can be used for screening behavior-evoked whole-brain activation, and we envision that future experiments will yield brainmap-like descriptions for other innate behaviors, such as aggression and defensive behaviors, or cognitive behaviors, such as attention and decision making. Further, the same method can be applied to genetic mouse models of neurodevelopmental disorders with the aim of identifying circuit deficits underlying changes in social, cognitive, and other higher-order brain functions.

EXPERIMENTAL PROCEDURES

Animals
Animal procedures were approved by the Cold Spring Harbor Laboratory Animal Care and Use Committee. The c-fos-GFP mice, Tg(Fos-tTA,Fos-EGFP) line, were obtained from The Jackson Laboratory. In our study, we used the direct c-fos-GFP signal, whereas several other studies used the tTA protein to drive other reporter molecules.

Behavioral Tests and c-fos-GFP Induction Time Course
Heterozygous c-fos-GFP male mice (8-11 weeks old) were individually housed for 1 week before the test. The behavioral stimuli were transfer of the animal to the experimental arena (handling control) or transfer of the animal plus introduction of an OVX conspecific female (male-female group), a conspecific male (male-male group), a 50 ml Falcon tube (object group), or a 50 ml Falcon tube with a side opening containing a cotton ball with isoamyl acetate (1:100 in mineral oil, 40 μl per experiment, freshly made each day; olfactory group). The stimulus was placed in the home cage for 90 s and then removed. The behavior was video-recorded and manually scored. After the behavioral stimulus was removed, the mice remained in the home cage for an additional 3 hr and were then killed by transcardial perfusion. For the time course of c-fos-GFP induction, isoamyl acetate was introduced into the mouse home cage for a brief period of 90 s. The mice were killed at selected time points of 0.5, 1.5, 3, and 5 hr poststimulation.

Brain Preparation, STP Tomography Imaging, and Data Processing
The brains were prepared as described in our previous study. Briefly, the brains were embedded in oxidized 4% agarose, crosslinked, and imaged as 280 serial sections. The raw image tiles were corrected for illumination, stitched in 2D in MATLAB, and aligned in 3D in Fiji. The CNs for detection of c-fos-GFP+ cells were trained on ground truth data marked up by an expert biologist. The CN performance was scored based on the F-score (the harmonic mean of the precision and recall). A stereological procedure was used to derive the factor that converts the 2D CN-based counts into 3D counts, from which the densities of c-fos-GFP+ cells per activated ROI were calculated. 3D registration methods with Elastix were the same as described previously, but with modified parameters. See the Supplemental Experimental Procedures for more details.

c-fos Immunohistochemistry and Comparison to c-fos-GFP+ Cell Counting
Wild-type C57BL/6 mice (8-10 weeks old) underwent the same behaviors as the c-fos-GFP mice of the male-female and handling groups. The mice were killed and perfused 1 hr later and the brains were fixed overnight in 4% paraformaldehyde, then cut as 50 μm coronal sections. For immunohistochemistry, sections were exposed to rabbit anti-c-fos antibody (1:10,000, Santa Cruz SC052) and labeled by DAB solution. FIJI (ImageJ) and Volocity (Perkin-Elmer) were used for cell counting.

Statistics
We ran statistical comparisons between different behavioral groups based on either ROIs or evenly spaced voxels. Voxels were overlapping 3D spheres with 100 μm diameter each and spaced 20 μm apart from each other. The cell count of each voxel was calculated as the number of nuclei found within 100 μm from the center of the voxel in all three dimensions. To account for multiple comparisons across all voxel/ROI locations, we thresholded the p values and reported FDRs. For correlation between c-fos-GFP cell counts and social behavior, Pearson correlation R values were calculated between c-fos-GFP cell counts and time spent in social behaviors. See the Supplemental Experimental Procedures for more details.

Turaga, S.C., Murray, J.F., Jain, V., Roth, F., Helmstaedter, M., Briggman, K., Denk, W., and Seung, H.S. (2010). Convolutional networks can learn to generate affinity graphs for image segmentation. Neural Comput. 22, 511-538.

Animals
Animal procedures were approved by the Cold Spring Harbor Laboratory Animal Care and Use Committee. All animals were housed under constant temperature and light conditions (12 hour cycle; lights ON: 0600, lights OFF: 1800) and given food and water ad libitum. The c-fos-GFP mice were obtained from the Jackson Laboratory as a double transgenic strain B6;DBA-Tg(Fos-tTA,Fos-EGFP*)1Mmay Tg(tetO-lacZ,tTA*)1Mmay/J (stock number 008344). The mice were bred with C57BL/6 to remove the tetO-lacZ,tTA* transgene and then backcrossed to C57BL/6 for >10 generations as the Tg(Fos-tTA,Fos-EGFP) line, which, for simplicity, we call c-fos-GFP mice in our study. These mice carry two transgenes integrated in the same genomic site: Fos-tTA driving the expression of the tetracycline transactivator (tTA) and Fos-EGFP driving the c-fos-GFP fusion protein. The c-fos-GFP transgene includes all 4 exons and all introns of the c-fos gene, with the EGFP sequence fused in-frame at the NcoI site to exon 4; this design was successfully used in two other independently generated c-fos reporter transgenic mice (Barth et al., 2004; Schilling et al., 1991; Wilson et al., 2002). In our study we used the direct c-fos-GFP signal, whereas several other studies used the tTA protein to drive other reporter molecules. The comparison between c-fos-GFP expression visualized by STP tomography and native c-fos expression visualized by anti-c-fos immunohistochemistry was used to validate the social interaction-evoked c-fos-GFP induction in several brain regions (Figure S5).

Behavioral tests
Heterozygous c-fos-GFP male mice were group-housed before the test. One week before experiments, adult mice (8-11 weeks old) were transferred to a designated area in the animal room and housed singly to lower the variability in baseline c-fos expression, as described. Note that the one-week adult isolation is too brief to evoke chronic behavioral and neuroendocrine stress responses (Lupien et al., 2009). All experiments were done between 11 am and 12:30 pm, and the mice were killed between 2 pm and 3:30 pm. The behavioral stimuli were: transfer of the animal to the experimental arena (handling control) or transfer of the animal plus introduction of an OVX conspecific female (male-female group), a conspecific male (male-male group), a 50 ml Falcon tube (object group), or a 50 ml Falcon tube with a side opening containing a cotton ball with isoamyl acetate (1:100 in mineral oil, 40 μl per experiment, freshly made each day) (olfactory group) (Yang and Crawley, 2009). The stimulus was placed in the home cage for 90 seconds and then removed. The OVX female mice and intruder males were 2-4 months old; the c-fos-GFP male mice were 2-3 months old. The behavioral response of c-fos-GFP male mice to the OVX females was also compared to the response to wild-type females (Figure S4). The behavior was video-recorded and manually scored off-line in Avidemux for the time spent in active social interaction (close following, anogenital sniffing, nose-to-nose touch, and other social sniffing) in the social groups (Figure S4; Movie S2). After the behavioral stimulus was removed, the mice remained in the home cage for an additional 3 hrs and were then killed by transcardial perfusion with 0.1 M phosphate-buffered saline (PB) followed by 4% paraformaldehyde (PFA) in 0.1 M PB. Since the handling group and the object group were not statistically different, we used only the handling group for comparison to the male-female, male-male and object+odor groups.

Brain preparation and STP tomography.
The brains were prepared as described in our previous study. Briefly, after perfusion the brains were postfixed overnight in 4% PFA at 4 °C, then kept for 48 hrs in 0.1 M glycine / 0.1 M PB, and stored in 0.05 M PB until imaging. The brains were embedded in oxidized 4% agarose in 0.05 M PB using a custom-built holder to maintain a consistent embedding position. The embedded brains were crosslinked in 0.2% sodium borohydride solution (in 0.05 M sodium borate buffer, pH 9.0-9.5) and imaged as 280 serial sections, each comprising a mosaic of 12 x 16 FOVs (X-Y 700 x 700 μm). The raw image tiles (16 bit tif; 70 GB per dataset) were corrected for illumination, stitched in 2D in MATLAB, and aligned in 3D in Fiji. Volocity (Perkin-Elmer) was used to visualize the whole brains and cell counts in 3D.

Automated c-fos-GFP+ cell counting
The selection of the CNs for detection of c-fos-GFP+ cells was done by training on ground truth data marked up by an expert biologist. The ground truth data comprised 72 FOVs randomly selected from a whole-brain dataset.
The CN configuration of 3 hidden layers (each layer was 20 units wide, the output of the network was 1 pixel, and the filters were 5 x 5 in size) was chosen after training multiple CNs with different filter sizes and different numbers of units in each layer (Figure S1). The ground truth data set was then divided into 6 folds (i.e. 5 training sets, each with 60 FOVs; the remaining 12 FOVs were assigned as a test set). Five CNs were trained, one with each training set. The threshold for each CN output was varied from 0.90 to 0.99 in steps of 0.01 and the threshold value with the maximum F-score was chosen as the threshold for that network. Of the five CNs, the CN with the maximum F-score on its test data was chosen for the analysis in the current study. The CN training was done using the Cortical Network Simulator (CNS) (Mutch et al., 2010). The training was accelerated on an NVIDIA GPU.
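The threshold selection described above can be sketched as a simple sweep. This is an illustration rather than the code used in the study: the input format (a list of detections, each with a CN output score and a flag saying whether it matches a ground-truth cell) and the function name `best_threshold` are assumptions made for the example.

```python
def best_threshold(detections, n_true_cells):
    """Sweep thresholds 0.90-0.99 (step 0.01) and return (threshold, F-score) with the highest F-score.

    detections: list of (cn_score, matches_ground_truth) pairs (hypothetical format).
    n_true_cells: number of cells in the ground-truth markup.
    """
    best_thr, best_f = None, -1.0
    for step in range(90, 100):                  # 0.90, 0.91, ..., 0.99
        thr = step / 100
        kept = [is_cell for score, is_cell in detections if score >= thr]
        tp = sum(kept)                           # detections that match a true cell
        fp = len(kept) - tp                      # detections with no matching true cell
        fn = n_true_cells - tp                   # true cells that were missed
        if tp == 0:
            f = 0.0
        else:
            precision = tp / (tp + fp)
            recall = tp / (tp + fn)
            f = 2 * precision * recall / (precision + recall)
        if f > best_f:                           # ties keep the lowest threshold
            best_thr, best_f = thr, f
    return best_thr, best_f


# Toy example with six candidate detections and four ground-truth cells.
toy = [(0.995, True), (0.98, True), (0.97, True), (0.95, False), (0.93, True), (0.91, False)]
thr, f = best_threshold(toy, n_true_cells=4)
print(thr, round(f, 3))
```

In this toy case a low threshold keeps both false positives, while a high threshold discards true cells, so the F-score peaks at an intermediate value.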
The CN performance was scored based on the F-score on a dataset of 10 FOVs marked by three experts, which also served to evaluate human inter-expert variability (F-score = the harmonic mean of the precision and recall, where precision is the ratio of correctly predicted cells to all predicted cells and recall is the ratio of correctly predicted cells to all ground-truth positive cells; ~1100 c-fos-GFP+ cells were marked). The cells marked by all three experts were used as the ground truth data to score the CN performance. To calculate inter-expert variability, each expert mark-up was set as the ground truth to score the other two experts, and the averaged recall and precision from the three comparisons were used to calculate the final inter-expert F-score.
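The scoring scheme above can be illustrated with a short sketch. It assumes that marked cells have already been matched across mark-ups and are represented by shared integer IDs, which is a simplification (real mark-ups would first need spatial matching), and it averages precision and recall over all ordered expert pairs, which is one reading of "the three comparisons". All names are hypothetical.

```python
from itertools import permutations


def precision_recall(predicted, truth):
    """Precision and recall of a set of predicted cell IDs against ground-truth IDs."""
    true_pos = len(predicted & truth)
    return true_pos / len(predicted), true_pos / len(truth)


def f_score(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


def inter_expert_f(markups):
    """Score every expert against every other expert used as ground truth, then average."""
    scores = [precision_recall(pred, truth) for truth, pred in permutations(markups, 2)]
    mean_p = sum(p for p, _ in scores) / len(scores)
    mean_r = sum(r for _, r in scores) / len(scores)
    return f_score(mean_p, mean_r)


# Toy mark-ups: each expert marks five cells, and any two experts agree on four of them.
experts = [{1, 2, 3, 4, 5}, {1, 2, 3, 4, 6}, {1, 2, 3, 5, 6}]
consensus = set.intersection(*experts)      # cells marked by all experts
cn_cells = {1, 2, 3, 7}                     # hypothetical CN detections

cn_f = f_score(*precision_recall(cn_cells, consensus))
expert_f = inter_expert_f(experts)
print(round(cn_f, 3), round(expert_f, 3))
```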
In the CN output images, signal smaller than 40 μm² was removed as noise and single c-fos-GFP+ cells were identified as circles of radii 4 to 14 μm. In this study, we did not analyze c-fos-GFP induction in the cerebellum by CN, because of a high false positive rate due to cellular autofluorescence specific to this brain region. We analyzed the cerebellum by visual inspection and detected only a few c-fos-GFP+ cells in either the social or object groups, suggesting a lack of c-fos induction in this brain region in our experiments (data not shown). In addition, the analysis of c-fos-GFP in the olfactory bulb was done using a separate CN trained specifically on OB images. This was because OB granule cells are tightly packed and smaller in diameter, which makes the OB signal not directly comparable to that of the rest of the mouse brain.
For CN data analysis the brightness of the signal of each sample was normalized by the mean and standard deviation of tissue autofluorescence signal from a coronal section at a bregma position of +0.20 mm.

c-fos-GFP density calculation
The following stereological procedure (Williams and Rakic, 1988) was used to generate the 2D-to-3D conversion ratio for calculating the densities of c-fos-GFP+ cells per activated ROI. First, one brain was imaged at xyz resolution 1 x 1 x 2.5 µm (i.e. as a 5600 serial section dataset) and c-fos-GFP+ cells were manually counted in 3D in fourteen "counting boxes" of 300 x 300 x 50 µm (xyz) randomly selected from the whole brain. Second, the 2D-to-3D conversion was derived by dividing the manual 3D counts by the single 2D counts measured by the CN from the middle section in each box. Third, the obtained conversion factor of 2.5 was used to multiply the 2D ROI counts in order to estimate the total numbers of c-fos-GFP+ cells, and the total counts were divided by the ROI volumes in order to estimate the densities of c-fos-GFP+ cells.
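The three steps above amount to a ratio followed by a scaling, as in the following minimal sketch. The counts are invented for illustration, and averaging the per-box ratios is an assumption, since the text does not state how the fourteen box ratios were combined.

```python
from statistics import mean


def conversion_factor(counts_3d, counts_2d):
    """Mean ratio of manual 3D counts to CN 2D counts across counting boxes (assumed pooling)."""
    return mean(c3 / c2 for c3, c2 in zip(counts_3d, counts_2d))


def density_per_mm3(roi_count_2d, factor, roi_volume_mm3):
    """Estimated c-fos-GFP+ cells per cubic mm in an ROI."""
    return roi_count_2d * factor / roi_volume_mm3


# Invented counts for three counting boxes (the study used fourteen).
manual_3d = [25, 50, 10]
cn_2d = [10, 20, 4]

factor = conversion_factor(manual_3d, cn_2d)
density = density_per_mm3(roi_count_2d=1440, factor=factor, roi_volume_mm3=1.0)
print(factor, density)
```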

c-fos immunohistochemistry and manual c-fos-GFP+ cell counting
Wild-type C57BL/6 mice (8 to 10 weeks old) underwent the same behaviors as the c-fos-GFP mice of the social (OVX female) and handling groups. The mice were killed and perfused 1 hour later and the brains were fixed overnight (O/N) in 4% PFA, then cut as 50 μm coronal sections and stored in cryoprotectant (30% ethylene glycol and 25% glycerol in 0.05 M PB) at -20 °C. For immunohistochemistry, sections were washed 3 x 10 min in 0.1 M PBS, incubated in 1.5% H2O2 in 0.1 M PBS for 15 min, washed 3 x 10 min in 0.1 M PBS, incubated in PBS+ (10% donkey serum and 0.3% Triton X-100) for 1 hour, and exposed to rabbit anti-c-fos antibody (1:10000, Santa Cruz SC052) O/N at 4 °C. The following day, the sections were washed 3 x 10 min in 0.1 M PBS, exposed to secondary biotinylated donkey anti-rabbit antibody (1:500) in PBS+ for 1 hour, washed 4 x 10 min in 0.1 M PBS, incubated for 1 hour in elite ABC mixture (6 ml of solution A and B in 1 ml of PBS+, Vector Labs), washed 4 x 10 min in 0.1 M PBS, exposed to DAB solution (10 mg DAB; Sigma D5905 in 20 ml 0.1 M PBS) for 4 minutes, rinsed with 0.1 M PBS for 1 min, washed 3 x 10 min in 0.1 M PBS, and mounted on slides. Dried sections on slides were dehydrated for 2 min each in 50%, 70%, 80%, 90%, 95%, and 100% EtOH and in xylene, and then cover-slipped. Sections were imaged by light microscopy (Leica Axiovision, 10x lens). For c-fos IHC cell counting per area, we first used FIJI (ImageJ) TrakEM2 to segment each anatomical ROI and then used Volocity (Perkin-Elmer) for cell counting. For the STP images, we also used FIJI to segment the ROIs corresponding to those analyzed by c-fos IHC and used CN detection for cell counting.

Time course of c-fos-GFP induction
Freshly prepared isoamyl acetate (ISO; 1:100 in mineral oil, 40 µl) was added to gauze in a 50 ml conical tube with a side opening, which was introduced into the mouse home cage for a brief period of 90 sec. Once the tube was removed, the mice were left undisturbed until they were sacrificed by transcardial perfusion at selected time points of 0.5, 1.5, 3 and 5 hours post ISO stimulation. Olfactory bulbs were imaged with STP tomography and c-fos-GFP+ cells were quantified in the MOB granular cell layer to examine the time course of c-fos-GFP induction (Figure S5).

Correlation between c-fos-GFP cell counts and social behavior
Significantly activated brain regions from the male-female and male-male versus the handling group comparisons (with FDR q < 0.01 as cutoff) were selected for the correlation analysis. First, Pearson correlation R values were calculated between c-fos-GFP cell counts and time spent in social behaviors, including anogenital sniffing, close following, nose-to-nose sniffing, and other sniffing. Second, the R values were transformed to one-tailed p-values, which were corrected for multiple comparisons by FDR.

3D brain registration
3D registration methods were the same as described previously, but with modified parameters. The affine transform was calculated using 4 resolution levels, while the B-spline step used 3 resolution levels. Mattes mutual information was used as the similarity measure between the moving and fixed images. The image similarity function was estimated and minimized iteratively for a set of randomly chosen samples within the images at each resolution. The registration takes 1 hour on 650 × 450 × 300 voxel sized images on an 8-core central processing unit (CPU) with 16 GB RAM. The images involved in the registration have a 20 μm × 20 μm × 50 μm voxel spacing. The entire image warping experiment is set up using Elastix (Klein et al., 2010), an image registration toolbox based on the Insight Toolkit (ITK). The precision of the registration was measured by the displacement of 13 landmark points in 6 different mouse brains after warping each dataset onto the average RSTP brain (Figure S2).
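The landmark-based precision measure can be sketched as the Euclidean distance between corresponding points, summarized as mean ± SD. The coordinates below are invented, and the function name is illustrative; the sketch assumes that landmark coordinates are expressed in μm in a common space.

```python
import math
from statistics import mean, stdev


def landmark_errors(reference_points, sample_points):
    """Euclidean distance between each reference landmark and its counterpart in a sample brain."""
    return [math.dist(ref, smp) for ref, smp in zip(reference_points, sample_points)]


# Invented (x, y, z) coordinates in micrometers for two landmarks.
reference = [(0.0, 0.0, 0.0), (100.0, 0.0, 0.0)]
registered_sample = [(30.0, 40.0, 0.0), (100.0, 0.0, 60.0)]

errors = landmark_errors(reference, registered_sample)
print(f"{mean(errors):.1f} ± {stdev(errors):.1f} μm")
```

Running the same summary on landmark positions before and after warping gives the kind of before/after comparison reported in Figure S2.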

Statistics
Power analysis. A region of interest (ROI) count is defined as the sum of c-fos-GFP cells within its boundary. For every ROI, we used the CN c-fos-GFP measurements of the experimental samples to estimate the maximum likelihood parameters (μ = mean and θ = shape) of a negative binomial distribution fitted to its count (McCullagh and Nelder, 1989; Venables and Ripley, 2002). These served as the starting points for our Monte Carlo simulations. For each set of estimated parameters, we generated two datasets of sample size n, the first from a negative binomial with parameters (μ, θ), and the second with parameters (e*μ, θ), where e is a scaling factor quantifying the effect size being introduced. For every region, we repeated this 30,000 times while modifying the effect size over the range 0.1 to 1.5. The power of a statistical test is defined as the probability of achieving a significant result given that the null hypothesis is false. We estimated it in the following way. For every simulated dataset we applied our statistical test (defined below) to obtain a p-value. If the p-value was below our selected significance level (α = 0.05), the test result was deemed significant and assigned a 1; otherwise, it was assigned a 0. An estimate of the statistical power is simply the average of these test results, or, put another way, the proportion of significant test results among the total number of tests run at that parameter setting. To determine an 'optimal' sample size for a given effect size (0.6), we plotted the number of ROIs having over 80% power as the sample size was varied over a reasonable range (5 to 30), and chose the n where we observed an 'elbow'. The 'elbow' represents the sample size where the contribution to power gained from adding another sample begins to wane.
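The simulation loop described above can be sketched as follows. This is a simplified illustration under stated assumptions, not the procedure used in the study: negative binomial counts are drawn as a gamma-Poisson mixture, and, to keep the example short, the negative binomial regression test is replaced by a stand-in Welch-type test on log-transformed counts with a normal approximation. The parameter values and all function names are hypothetical.

```python
import math
import random
from statistics import NormalDist, mean, variance


def sample_negbin(mu, theta, rng):
    """Draw one negative binomial count (mean mu, shape theta) as a gamma-Poisson mixture."""
    lam = rng.gammavariate(theta, mu / theta)   # gamma with shape theta and mean mu
    # Knuth's Poisson sampler; adequate for the small means used in this sketch.
    limit, k, p = math.exp(-lam), 0, 1.0
    while True:
        p *= rng.random()
        if p <= limit:
            return k
        k += 1


def stand_in_p_value(a, b):
    """Two-sided p-value of a Welch-type statistic on log(1 + count), normal approximation.

    Stand-in for the negative binomial regression test described in the text.
    """
    la = [math.log1p(x) for x in a]
    lb = [math.log1p(x) for x in b]
    se = math.sqrt(variance(la) / len(la) + variance(lb) / len(lb))
    if se == 0:
        return 1.0
    z = (mean(la) - mean(lb)) / se
    return 2 * (1 - NormalDist().cdf(abs(z)))


def estimate_power(mu, theta, effect, n, n_sim=2000, alpha=0.05, seed=0):
    """Fraction of simulated experiments in which the test is significant at level alpha."""
    rng = random.Random(seed)
    hits = 0
    for _ in range(n_sim):
        control = [sample_negbin(mu, theta, rng) for _ in range(n)]
        treated = [sample_negbin(effect * mu, theta, rng) for _ in range(n)]
        hits += stand_in_p_value(control, treated) < alpha
    return hits / n_sim


# effect = 1.0 leaves the mean unchanged, so the estimate should stay near alpha.
power_null = estimate_power(mu=20, theta=5, effect=1.0, n=10, n_sim=500)
power_alt = estimate_power(mu=20, theta=5, effect=0.4, n=10, n_sim=500)
print(power_null, power_alt)
```

Repeating `estimate_power` over a grid of sample sizes for each ROI and counting the ROIs that exceed 80% power would give the curve on which the 'elbow' is read.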
Statistical power for the correlation of c-fos-GFP counts and social behaviors follows the previously established power curve for the Pearson correlation. With the selected significance level (α = 0.05) and the sample size (N = 26 brains), we obtained >80% statistical power to detect significant correlation values >0.47.
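As a rough cross-check of this power statement, the following sketch uses the Fisher z approximation for the Pearson correlation. This approximation is swapped in for illustration and is not necessarily the power curve used in the study; the one-tailed default mirrors the one-tailed p-values used in the correlation analysis.

```python
import math
from statistics import NormalDist


def pearson_power(r, n, alpha=0.05, one_tailed=True):
    """Approximate power to detect a true correlation r with n samples (Fisher z approximation)."""
    z_r = math.atanh(r)                       # Fisher transform of the true correlation
    se = 1 / math.sqrt(n - 3)                 # approximate standard error of the transformed r
    tail = alpha if one_tailed else alpha / 2
    z_crit = NormalDist().inv_cdf(1 - tail)   # critical value of the standard normal
    return NormalDist().cdf(z_r / se - z_crit)


power = pearson_power(r=0.47, n=26)
print(round(power, 2))
```

Under this approximation, the power at r = 0.47 and N = 26 comes out close to 0.8 and rises for larger correlation values.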
Statistical analysis between experimental groups. We ran statistical comparisons between different behavioral groups based on either ROIs or evenly spaced voxels. Voxels were overlapping 3D spheres with 100 μm diameter each and spaced 20 μm apart from each other. The cell count of each voxel was calculated as the number of nuclei found within 100 μm from the center of the voxel in all three dimensions. We assumed that the cell counts at a given location, Y, follow a negative binomial distribution whose mean is linearly related to one or more experimental conditions, X: E[Y] = α + βX. For example, when testing a social group versus a control group, our X is a single column showing the categorical classification of each mouse sample by group ID, i.e. 0 for the control group and 1 for the social group (O'Hara and Kotze, 2010; Venables and Ripley, 2002). We found the maximum likelihood coefficients α and β through iteratively reweighted least squares, obtaining estimates for sample standard deviations in the process, from which we obtained the significance of the β coefficient. A significant β means that the group status is related to the cell count intensity at the specified location. The z-values in our summary tables correspond to this β coefficient normalized by its sample standard deviation, which, under the null hypothesis of no group effect, has an asymptotic standard normal distribution. The p-values give the probability of obtaining by chance a β coefficient at least as extreme as the one observed, assuming this null hypothesis is true. In the current case of three groups, we utilized Tukey's honest significant difference test to adjust the p-values of the group factor coefficients to control for multiple comparisons: group1v2, group1v3 and group2v3. To account for multiple comparisons across all voxel/ROI locations, we thresholded the p-values and reported false discovery rates with the Benjamini-Hochberg procedure (Benjamini and Hochberg, 1995).
In contrast to correcting for type I error rates, this method controls the expected proportion of false positives among the tests that have been deemed significant.
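For readers unfamiliar with the Benjamini-Hochberg procedure, the following minimal sketch computes FDR-adjusted q-values from a list of p-values. The function name and the example p-values are illustrative; in the analysis above, the input would be one p-value per voxel or ROI.

```python
def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted q-values in the same order as the input p-values."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])   # indices sorted by ascending p
    q_values = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so each q-value is the minimum of
    # p * m / rank over its own rank and all higher ranks (enforces monotonicity).
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        q_values[i] = running_min
    return q_values


p = [0.001, 0.04, 0.03, 0.5]
q = benjamini_hochberg(p)
significant = [qi < 0.05 for qi in q]
print([round(qi, 4) for qi in q], significant)
```

In this example only the first test passes an FDR cutoff of 0.05, although three of the four raw p-values are below 0.05.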
To compare voxel activation between the female and male stimuli (Figures 3-6), voxels that passed the FDR cutoff of 0.05 were pseudo-colored (red and green) for each dataset and the activation maps were overlaid on the RSTP brain (e.g. Figure 3C).

and D4 point to regions of high autofluorescence, which cause high rates of false positive detection by other tested methods, but not by CNs. Arrows in d2 indicate an example of dim cells that were not detected by CNs. Arrows in d3 show an example of two neighboring cells that were detected by CNs and further separated by the cell separation algorithm. Scale bar in d and d1 = 1 mm and 100 μm, respectively. (E-F) Evaluation of CN and human performance depending on the background autofluorescence. Y axis shows precision, recall, and F score from 10 FOV tiles from the ground truth dataset; X axis shows autofluorescence brightness of the brain regions in the particular tiles. The CN (E) and human (F) performance was overall independent of the background, with the exception of the "darkest tile", which had a lower CN recall of 0.63 (F score 0.76, and precision 0.94). This tile included the caudal olfactory tubercle area, which comprises myelinated fibers passing from the dorsal striatum. This suggests that increased light scattering in areas with myelinated tracts may somewhat lower CN recall performance.

Figure 1. (A) Generation of the RSTP Brain. A single STP brain dataset (A1) was registered by 3D affine and B-spline transformations to the z-stack of Nissl-stained ABA coronal sections (A2). This generated the first transformed STP brain dataset (A3) matched in 3D to the Nissl ABA brain. Next, 39 other STP brains (A4) were registered to the transformed STP dataset (A3), generating the averaged RSTP brain (A5) (see Movie S1). (B-C) Validation of 3D registration accuracy. (B) Thirteen unique points were marked up in the RSTP Brain and in 6 sample STP tomography brains.
These points were previously identified as unique 3D landmarks in the Waxholm space (Hawrylycz et al., 2011): (1) frontal middle 1, (2) frontal right 2, (3) frontal left 2, (4) anterior commissure right, (5) anterior commissure left, (6) crossing of the anterior commissure, (7) corpus callosum middle, (8) hippocampus middle, (9) interpeduncular nucleus right, (10) interpeduncular nucleus middle, (11) interpeduncular nucleus left, (12) pontine nucleus middle, (13) cortex middle (http://scalablebrainatlas.incf.org/main/coronal3d.php?template=WHS11&). (C) The distance for each point between the RSTP Brain and the 6 sample brains was (mean ± SD) 682.6 ± 327.6 μm before and 65.0 ± 39.9 μm after 3D registration by affine and B-spline transformation.

(A) Each ABA Nissl-stained section (A1) was registered by 2D transformations to the corresponding section from the reporter CAG-Keima brain (A2), which was previously registered to the RSTP brain. The cellular Nissl-like fluorescent signal of the Keima FP (Kogure et al., 2006) (A2) improved the precision of the alignment between the two datasets (ABA and RSTP brains) (A3). The ABA anatomical labels (A4), which are based on the original Nissl dataset (A1), were transformed using the 2D parameters from the Nissl to CAG-Keima registration (A1-3). This further improved the precision of the alignment of the ABA labels to the RSTP brain (A5-6). (B-D) Validation of the alignment of the ABA anatomical labels and anatomical structures in the RSTP Brain. First row panel: the RSTP brain autofluorescence signal can be used to align the brain surface contour (arrowhead in B), as well as the borders of internal structures, such as the habenula (arrowhead in C) and zona incerta (arrowhead in D). Second row panel: the fluorescence signal from STP brains of H2B-GFP interneuron cell-type reporter mice registered to the RSTP brain can be used to delineate internal structures not clearly visible in the autofluorescence signal.
These include the bed nuclei of the stria terminalis, posterior division, principal nucleus (BSTpr) in the glutamic acid decarboxylase (GAD) brain; the medial amygdalar nucleus, posterodorsal part (MEApd) in the somatostatin (SST) brain; and the lateral mammillary nucleus (LM) in the parvalbumin (PV) brain. Third row panel: the ABA anatomical labels registered to the RSTP brain. Fourth row panel: the ABA anatomical labels registered to and overlaid on the RSTP brain.
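The landmark-based validation of registration accuracy described above reduces to measuring the 3D Euclidean distance between matching points and summarizing it as mean ± SD. The following is a minimal sketch of that calculation; the coordinates are invented for illustration, the function name `landmark_errors` is hypothetical, and the use of the sample (n - 1) standard deviation is an assumption, since the text does not state which SD estimator was used.

```python
import math
import statistics


def landmark_errors(reference_points, sample_points):
    """Euclidean distance (in μm) between each pair of matching landmarks."""
    return [math.dist(r, s) for r, s in zip(reference_points, sample_points)]


# Hypothetical landmark coordinates in μm (x, y, z); not data from the study.
reference = [(0.0, 0.0, 0.0), (1000.0, 0.0, 0.0), (0.0, 2000.0, 500.0)]
sample = [(30.0, 40.0, 0.0), (1000.0, 60.0, 80.0), (0.0, 2000.0, 550.0)]

errors = landmark_errors(reference, sample)
mean_error = statistics.mean(errors)
# Assumption: sample standard deviation (n - 1 in the denominator).
sd_error = statistics.stdev(errors)
```

Running the same calculation on landmark positions before and after registration would give the two mean ± SD values that are compared in panel (C).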

Figure S4. Characterization of experimental behaviors used in Figures 3-7.
Characterization of the 90-sec interaction between two males (white), a male and an OVX female (black), and a male and an intact female (gray). There was no significant difference between male interaction with an OVX female or an intact female (n = 11). The male spent more time interacting with an OVX female (n = 13) than with another male (n = 13) in total interaction time (37.7 ± 3.5 and 18.1 ± 4.2; p = 0.001), anogenital sniffing (17.3 ± 1.8 and 7.9 ± 2.0; p = 0.002), and close following (8.9 ± 1.8 and 2.3 ± 1.0; p = 0.004). No significant difference was seen in nose-to-nose sniffing (3.2 ± 0.6 and 3.0 ± 0.8; p = 0.8) or other sniffing (8.3 ± 1.9 and 3.0 ± 0.8; p = 0.16). All values are in seconds.

Figures 3-7. (A) c-fos-GFP induction in the olfactory bulb was examined at 0.5, 1.5, 3, and 5 hours after 90 sec of ISO stimulation (Methods). (B) CN detection of c-fos-GFP in the boxed area from (A). Note the highest number of c-fos-GFP+ cells at 3 hours after the stimulation. (C) Quantification (mean ± SD) of the c-fos-GFP+ cell counts in the MOB granular cell layer shows a peak of induction at about 3 hours, which returns to baseline by 5 hours after ISO stimulation. (D-I) Comparison of c-fos-GFP and native c-fos induction. (D-F) Representative images of c-fos-GFP labeling in the ORBm cortex after handling (D) and social behavior (E), imaged by STP tomography; panel (F) shows the location of the zoomed-in views in the corresponding coronal section. (G-H) Representative images of anti-c-fos immunohistochemistry from C57BL/6 mice in the matching ORBm cortex after handling (G) and social stimulation (H), imaged by bright field microscopy. Scale bar = 100 µm.
(I) Quantitation of c-fos+ and c-fos-GFP+ cell counts in eight selected regions: ILA (infralimbic area), PL (prelimbic area), ORBm (orbital medial cortex), BLAa (anterior basal lateral amygdala), COApl (cortical amygdala, posterior lateral), MEA (medial amygdalar nucleus), DG (dentate gyrus), and PIR (piriform cortex); all values are mean ± SEM; asterisk = p < 0.05; n.s. = not significant. On average, the c-fos-GFP cell counts detected by STP tomography represented ~59% of the c-fos counts detected by immunohistochemistry, in which the c-fos protein signal is enhanced by antibody staining. Importantly, both STP tomography and immunohistochemistry detected comparable induction changes between the female and handling groups: ILA = 1.8 and 2.1, PL = 1.9 and 2.8, ORBm = 2.4 and 2.7, BLAa = 2.0 and 1.9, COApl = 2.6 and 2.4, MEA = 1.9 and 2.0, DG = 1.2 and 1.3, and PIR = 2.2 and 1.8.

Figure S6. Principles of voxel-based statistical analysis used in Figures 3-6.
(A-C) Mean voxel-based c-fos-GFP+ cell counts of 13 brains from the handling (A), male-female (B), and male-male (C) groups. The voxels are spheres of 100 µm diameter, with 20 µm spacing; the brightness of the signal is based on the number of cells per voxel, as shown in the heat map index in (A). (D-E) The statistically significant voxels from the comparison of the handling group to the male-female group (D) and to the male-male group (E) are color-coded according to the level of statistical significance, as shown in the heat map index in (D). (F) Activated voxels (with FDR q < 0.05) were binarized, assigned a different color depending on the stimulation (red for male-female and green for male-male evoked activation), and overlaid on the RSTP brain. The arrow points to an example of a large hotspot of activation in the medial orbital cortex. See also Movie S3 (the A/P bregma position is indicated at the lower left; the grid for M/L and D/V bregma positions is overlaid).
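The voxelization described in (A-C) can be sketched as counting the detected cells that fall inside overlapping spheres. The example below assumes that "20 µm spacing" means the center-to-center distance of sphere centers on a regular grid; that reading, the function name `voxel_counts`, and the cell coordinates are illustrative assumptions, not the published implementation.

```python
import math


def voxel_counts(cells, grid_shape, spacing=20.0, diameter=100.0):
    """Count cells inside spherical voxels centered on a regular 3D grid.

    Assumption: sphere centers sit `spacing` μm apart, so neighboring
    spheres of `diameter` μm overlap and one cell can count toward
    several voxels.
    """
    radius = diameter / 2.0
    nx, ny, nz = grid_shape
    counts = {}
    for ix in range(nx):
        for iy in range(ny):
            for iz in range(nz):
                center = (ix * spacing, iy * spacing, iz * spacing)
                # A cell belongs to the voxel if it lies within the sphere.
                counts[(ix, iy, iz)] = sum(
                    1 for cell in cells if math.dist(cell, center) <= radius
                )
    return counts


# Hypothetical cell centroids in μm; six voxel centers along the x axis.
cells = [(0.0, 0.0, 0.0), (40.0, 0.0, 0.0), (200.0, 0.0, 0.0)]
counts = voxel_counts(cells, grid_shape=(6, 1, 1))
```

This brute-force loop is only practical for small grids; a whole-brain analysis would need a spatial index or a convolution-based approach, but the counting rule would be the same under the stated assumption.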

Figure S7. c-fos-GFP cell density in male- and female-stimulus-evoked brain regions based on results shown in Figures 3-7.
The c-fos-GFP+ cell density of brain regions significantly activated by either male (green bar) or female (red bar) stimulation, in comparison to the handling group (blue bar), is displayed across the whole brain. Red, green, and yellow boxes and a brown bar next to an ROI indicate that the ROI was activated by the female stimulation, the male stimulation, both the female and male stimulations, and the ISO stimulation, respectively, with an FDR cutoff of 0.01. "Average" in the last bar graph represents the mean value of all significantly activated ROIs.

Table S1. ROI-based analysis of c-fos-GFP induction of the handling, male-female, male-male and ISO groups against the baseline group, related to Figures 3-7.
Table S1a shows the comparison between the handling and baseline groups, Table S1b shows the male-female and baseline comparison, Table S1c shows the male-male and baseline comparison, and Table S1d shows the ISO and baseline comparison. The ROIs with statistically increased c-fos-GFP induction are color-coded: light-blue for 1 × 10⁻² to 1 × 10⁻³, green for 1 × 10⁻³ to 1 × 10⁻⁵, and red for < 1 × 10⁻⁵. Column A = the abbreviated names of the ABA ROIs; column B = the full names of the ABA ROIs; column C = the unique numerical ID for each ROI; column D = the hierarchical structure order of each ROI; columns E-H = mean and SD for each ROI in the corresponding experimental groups; columns I-K = the statistical z-scores, uncorrected p values, and corrected FDR q values, respectively; columns M-N = color-coding information based on FDR values. The anatomical location of the ROIs can be viewed in the online Allen Brain Atlas (http://atlas.brain-map.org).

Table S2. ROI-based analysis of c-fos-GFP induction of the male-female, male-male and ISO groups against the handling group, related to Figures 3-7.
Table S2a shows the comparison between the male-female and handling groups, Table S2b shows the male-male vs. handling group comparison, and Table S2c shows the ISO vs. handling group comparison. The format of the table is the same as that of Table S1.

Table S3. ROI-based analysis of c-fos-GFP induction of the male-female and male-male groups against the ISO group, related to Figures 3-7.
Table S3a shows the comparison between the male-female and ISO groups with the ROIs higher in male-female highlighted, Table S3b shows the comparison between the male-female and ISO groups with the ROIs higher in ISO highlighted, Table S3c shows the comparison between the male-male and ISO groups with the ROIs higher in male-male highlighted, and Table S3d shows the comparison between the male-male and ISO groups with the ROIs higher in ISO highlighted. The format of the table is the same as that of Table S1.
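The color coding of the tables, as described for Table S1, maps each FDR q value to one of three significance bands. A minimal sketch of that mapping is given below; the function name `fdr_color` is hypothetical, and the assignment of values that fall exactly on a band boundary is an assumption, since the legend does not specify it.

```python
def fdr_color(q):
    """Map an FDR q value to the color band described for Table S1.

    Assumption: boundary values are assigned to the less significant
    band (e.g. q == 1e-5 is "green"); the legend does not specify this.
    """
    if q < 1e-5:
        return "red"
    if q < 1e-3:
        return "green"
    if q < 1e-2:
        return "light-blue"
    return None  # not highlighted


# Hypothetical q values for four ROIs.
colors = [fdr_color(q) for q in (5e-6, 1e-4, 5e-3, 0.2)]
```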