Regional Diversity in the Postsynaptic Proteome of the Mouse Brain

The proteome of the postsynaptic terminal of excitatory synapses comprises over one thousand proteins in vertebrate species and plays a central role in behavior and brain disease. The brain is organized into anatomically distinct regions, and whether the synapse proteome differs across these regions is poorly understood. Postsynaptic proteomes were isolated from seven forebrain and hindbrain regions in mice and their composition determined using proteomic mass spectrometry. Seventy-four percent of proteins showed differential expression and each region displayed a unique compositional signature. These signatures correlated with the anatomical divisions of the brain and their embryological origins. Biochemical pathways controlling plasticity and disease, protein interaction networks and individual proteins involved with cognition all showed differential regional expression. Combining proteomic and connectomic data shows that interconnected regions have specific proteome signatures. Diversity in synapse proteome composition is a key feature of mouse and human brain structure.


Introduction
Synapses are the specialized junctions between nerve cells and are present in vast numbers in the mammalian nervous system. During the 1990s, synapses were thought to be relatively simple connectors, but the application of proteomic mass spectrometry in 2000 revealed an unanticipated complexity in their protein composition [1]. Both the presynaptic and postsynaptic proteomes have since been systematically characterized in several vertebrate species and thousands of proteins have been identified [2][3][4][5][6][7][8][9][10][11][12][13]. Phosphoproteomic studies have shown that neural activity causes changes in large numbers of proteins [14,15]. These findings have transitioned the view of synapses to one where they are highly sophisticated and complex signaling machines that process information. The importance of understanding this complexity is underscored by the finding that over 130 human brain diseases are caused by mutations disrupting postsynaptic proteins [16,17].
It is of fundamental importance to understand how the high number of postsynaptic proteins are organized physically (within synapses) and spatially (between synapses). Biochemical studies have shown that postsynaptic proteins are typically assembled into a hierarchy of complexes and supercomplexes (complexes of complexes) [18][19][20]. The prototypical postsynaptic supercomplexes are those formed by the scaffolding protein PSD95 (also known as Dlg4). Dimers of PSD95 assemble with complexes of neurotransmitter receptors, ion channels, signaling and structural proteins into a family of high molecular weight (1-3 MDa) structures in excitatory synapses [18][19][20]. PSD93 (also known as Dlg2), which is a paralog of PSD95, co-assembles with PSD95 to bind N-methyl-D-aspartate (NMDA) receptors and these are an important functional subset of the PSD95 supercomplex family. Other combinations of proteins form other subtypes of PSD95 supercomplexes, such as those containing potassium channels and serotonin receptors [20][21][22][23]. Together, these members of the PSD95 supercomplex family confer diverse signal processing functions to the synapse.
The principles underlying the spatial organization of synapse proteomes in the brain are less well understood. To date, most studies of the synapse proteome have focused on defining composition from limited regions of the brain (or the whole brain). However, at the macroscopic level, brain architecture is characterized by regions with distinct functions [24]. It is therefore important to ask if synapse proteomes differ between brain regions and whether any differences might be relevant to their function or to the connectivity between these regions. In a recent study, we reported that regions of human neocortex differ in the composition of their postsynaptic proteomes and that these compositional differences correlate with functional properties [25]. The present study applies a similar analysis to the mouse brain, which allows us to ask if conserved principles may apply across these two species that evolved from a common ancestor ~90 million years ago.
Using a method suitable for the isolation and direct quantification of mouse synapse proteomes from small amounts of brain tissue, we compared and contrasted the synapse proteomes isolated from seven integral regions of the adult mouse brain. The postsynaptic proteome was analyzed to a depth of 1173 proteins and differential expression signatures were identified and characterized in each brain region. We used these datasets to analyze the spatial organization of the postsynaptic proteome in the mouse brain and identify organizational principles shared with humans [25,26]. This large-scale dataset is a useful resource for the field of neuroscience and future studies using mouse models of human disease.

Dissections of Mouse Brain Regions
This study was performed using 8-week-old male C57BL/6J mice. All experimental protocols involving the use of animals were performed in accordance with recommendations for the proper care and use of laboratory animals and under the authorization of the regulations and policies governing the care and use of laboratory animals (EU directive No. 86/609 and Council of Europe Convention ETS123, EU decree 2001-486 and Statement of Compliance with Standards for Use of Laboratory Animals by Foreign Institutions No. A5388-01, National Institutes of Health, Bethesda, MD, USA).
The mice (n = 6) were anesthetized with a pentobarbital dose of 40 mg/kg body weight and sacrificed by decapitation. The brains were rapidly removed and kept on ice while the areas of interest were dissected from the right hemisphere using the microdissection method of Palkovits [27]. Large regions were collected from the frontal, medial and caudal cortex, as well as the right caudate putamen, right hippocampus, whole hypothalamus, and cerebellum (right half), which was cut previously through the vermis (Figure S1). The samples were frozen in liquid nitrogen and stored at −80 °C until processed.

PSD Isolation and Protein Preparations for Mass Spectrometry
Dissected mouse brain regions were homogenized by performing 12 strokes with a Dounce homogenizer containing 2 mL ice-cold homogenization buffer (320 mM sucrose, 1 mM HEPES, pH 7.4) containing 1× Complete EDTA-free protease inhibitor (Roche) and 1× Phosphatase inhibitor cocktail set II (Calbiochem). Synaptosomes were isolated from homogenized mouse brain tissue as described [2]. Briefly, insoluble material was pelleted by centrifugation at 1000 × g for 10 min at 4 °C. The supernatant (S1) was removed and the pellet resuspended in 1 mL homogenization buffer and an additional six strokes were performed. Following a second centrifugation at 1000 × g for 10 min at 4 °C, the supernatant (S2) was removed and pooled with S1. The combined supernatants were then centrifuged at 18,500 × g for 15 min at 4 °C. The pellet was resuspended in 0.25 mL homogenization buffer and 0.25 mL extraction buffer (50 mM NaCl, 1% DOC, 25 mM Tris-HCl, pH 8.0) containing 1× Complete EDTA-free protease inhibitor (Roche) and 1× Phosphatase inhibitor cocktail set II (Calbiochem) and incubated on ice for 1 h. The resulting PSD extracts were centrifuged at 10,000 × g for 20 min at 4 °C and the resulting supernatant filtered through a 0.2 µm syringe filter (Millipore).

Sample Preparation and LC-MS/MS Analysis
All chemicals were purchased from Sigma-Aldrich unless otherwise stated. Acetonitrile and water for HPLC-MS/MS and sample preparation were HPLC quality and were purchased from Thermo Fisher Scientific (Loughborough, UK). Formic acid was supra-pure (90-100%) and purchased from Merck KGaA (Darmstadt, Germany), while sequencing-grade trypsin was purchased from Promega (Southampton, UK). All HPLC-MS connector fittings were purchased from either Upchurch Scientific (Hichrom) or Valco (RESTEK). Fifty micrograms of PSD proteins were acetone precipitated, protein pellets reconstituted in SDS-PAGE loading buffer, and briefly run on a 4-12% Bis-Tris gradient gel (Invitrogen) for ~10 min. Proteins were in-gel digested using a method similar to that of Shevchenko et al. (2006) [28]. Resulting peptide extracts were then acidified with 7 µL 0.05% TFA and were filtered with a Millex filter (Millipore) before HPLC-MS analysis. Nano-HPLC-MS/MS analysis was performed using an on-line system consisting of a nano-pump (Dionex Ultimate 3000, Thermo Fisher) coupled to a QExactive instrument (Thermo Fisher) with a pre-column of 300 µm × 5 mm (Acclaim Pepmap, 5 µm particle size) connected to a column of 75 µm × 50 cm (Acclaim Pepmap, 3 µm particle size). Samples were analyzed on a 90-min gradient in data-dependent analysis (one survey scan at 70 k resolution followed by the top ten MS/MS).

Mass Spectrometry and Data Analysis
Data from MS/MS spectra were searched using MASCOT version 2.4 (Matrix Science Ltd., London, UK) against the Mus musculus subset of the National Center for Biotechnology Information (NCBI) protein database (382,487 protein sequences) with maximum missed-cut value set to 2. The following features were used in all searches: (i) variable methionine oxidation; (ii) fixed cysteine carbamidomethylation; (iii) precursor mass tolerance of 10 ppm; (iv) MS/MS tolerance of 0.05 amu; (v) significance threshold (p) below 0.05 (MudPIT scoring); and (vi) final peptide score of 20.
Progenesis version 4 (Nonlinear Dynamics, Newcastle upon Tyne, UK) was used for HPLC-MS label-free quantitation. Only MS/MS peaks with a charge of 2+, 3+ or 4+ were considered for the total number of "Features" (a "Feature" being the signal at one particular retention time and m/z) and only the five most intense spectra per "Feature" were included. Each LC-MS run was normalized by multiplying by a scalar factor. The scalar factor is a ratio in log space of the median intensity of the selected features against the median intensity of the selected features of a reference spectrum. The associated unique peptide ion intensities for a specific protein were then summed to generate an abundance value and transformed using an ArcSinH function. Based on the abundance values, within-group means were calculated and from these the fold changes (in comparison to control) were evaluated. One-way analysis of variance (ANOVA) was used to calculate the p-value based on the transformed abundance values. p-values were adjusted for multiple comparisons and were calculated either from Progenesis version 4 (Nonlinear Dynamics) or using R (R Core Team, 2013) [29] based on Benjamini and Hochberg (1995) [30]. Further analysis used Z-scores calculated from the ArcSinH-transformed group averages.
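The transformation and multiple-testing correction described above can be sketched as follows. This is a minimal Python illustration only, not the Progenesis or R code used in the study; the function names (`arcsinh_abundance`, `benjamini_hochberg`) and all example numbers are hypothetical.

```python
import math

def arcsinh_abundance(peptide_intensities):
    """Sum the unique-peptide ion intensities of one protein, then apply arcsinh."""
    return math.asinh(sum(peptide_intensities))

def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p-values in the input order."""
    m = len(p_values)
    # Indices of the p-values sorted from smallest to largest.
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p-value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted

# Hypothetical example values, for illustration only.
abundance = arcsinh_abundance([1200.0, 850.0, 430.0])
q_values = benjamini_hochberg([0.01, 0.04, 0.03, 0.005])
```

The ArcSinH transform behaves like a logarithm for large intensities but remains defined at zero, which is one plausible reason to prefer it for label-free abundance values.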
Differentially expressed proteins were only considered significant in the current study if the following conditions were fulfilled: (i) adjusted p-values (pairwise) were less than 0.05; (ii) the number of unique peptides detected and used in quantification per protein was at least 2 for the 1173 dataset; and (iii) absolute fold change was at least 1.5 for upregulated proteins and ≤0.667 for downregulated proteins.

Bioinformatic Analyses
The majority of the analysis was performed in the R software environment for statistical computing and graphics. Principal component analysis (PCA) and the Tukey test were performed with the R package FactoMineR and correlation analysis with the package corrplot. Differential stability (DS) analysis was performed as described [31]; briefly, for each protein from the list of 1173, the average Pearson correlation coefficient was estimated from the 15 pairwise Pearson coefficients for six brain samples. Based on DS, Tukey and PCA analyses, we determined that the data from all six individuals could be combined and the mean protein abundances were then used for all downstream analyses. Heatmaps were generated using the heatmap.2 function from the gplots R library: parameters were set to default values with the exception of label and dendrogram visualization control. Hierarchical clustering validation and comparison of dendrograms were performed with the package dendextend [32]. The number of stable clusters was independently assessed with the NbClust package [33], which provides a list of indices to determine the optimal number of clusters. We selected a set of six clusters [the postsynaptic proteome modules (PPMs)] based on the best combination of indices provided by the NbClust R package. Individual proteins in each of the six PPMs and their abundances across all seven integral regions are listed in Table S5, and proteins in each module were ranked by their abundance in each of the seven regions of the mouse brain examined.
GO function and KEGG pathway enrichment for all proteins was performed using DAVID (https://david.ncifcrf.gov/). Disease enrichment for each brain region and each protein module was performed using DAVID (https://david.ncifcrf.gov/) and KEGG pathway enrichment was then performed by searching ranked protein lists obtained using GSEA version 2.1.0 (http://software.broadinstitute.org/gsea/index.jsp) as previously described [35].
Circular hierarchical clustering of protein modules for the visualization of inter-region molecular interactions was performed using Circos (http://circos.ca) [36]. A Circos configuration file was created representing brain regions as "karyotypes". All proteins were grouped into "modules" according to their abundance similarity. Proteins that have positive abundance in more than one region were shown as links between regions. The width of each link is proportional to the fraction of the regional proteins that contributed to the link, while its color corresponds to that of the respective "module". All preprocessing of the relative abundance information and generation of appropriate Circos files were performed in R. Scripts are available on request.
For DS analysis, we used the described approach [31] on the MS intensity values obtained for all 1173 proteins identified with a minimum of two unique peptides in order to identify synaptic proteins with highly reproducible expression patterns across all six independent mouse brains. The average pairwise Pearson correlation ρ over the six individual mouse brains was quantified and obtained DS values ranged from 0.96 to −0.22 (avgCor, Table S3). As DS reflects the tendency of a gene to exhibit reproducible differential expression relationships across brain structures, the higher DS value represents a more reproducible relationship.
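The DS calculation for a single protein can be illustrated with a short sketch. This is a plain-Python illustration under stated assumptions rather than the published implementation [31]: the helper names (`pearson`, `differential_stability`) and the example profiles are hypothetical, and each profile is assumed to list the abundance of one protein across the brain regions in the same order for every brain.

```python
from itertools import combinations
import math

def pearson(x, y):
    """Pearson correlation of two equal-length profiles (non-zero variance assumed)."""
    n = len(x)
    mean_x = sum(x) / n
    mean_y = sum(y) / n
    cov = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    var_x = sum((a - mean_x) ** 2 for a in x)
    var_y = sum((b - mean_y) ** 2 for b in y)
    return cov / math.sqrt(var_x * var_y)

def differential_stability(profiles):
    """Average Pearson correlation over all unordered pairs of brains.

    `profiles` holds one list per brain: the abundance of a single protein
    in each region, with regions in the same order for every brain.
    """
    pairs = list(combinations(profiles, 2))
    return sum(pearson(a, b) for a, b in pairs) / len(pairs)

# Hypothetical regional profiles of one protein in three brains.
brain_a = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0]
brain_b = [2.0, 4.0, 6.0, 8.0, 10.0, 12.0, 14.0]  # same pattern, scaled
brain_c = [11.0, 12.0, 13.0, 14.0, 15.0, 16.0, 17.0]  # same pattern, shifted
ds_value = differential_stability([brain_a, brain_b, brain_c])
```

Because Pearson correlation ignores scale and offset, a protein whose regional pattern is preserved between animals scores close to 1 even if its absolute abundance differs between brains.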
For correlation with mesoscale mouse connectome data [37], the mean voxel sum for each region was calculated with respect to all other regions and itself. The correlation of this matrix was then estimated with the matrix of protein abundances. The results are listed in Table S3.

Protein Interaction Identification and Mapping
The full postsynaptic proteome network was built from the list of 1173 proteins obtained in this study and protein-protein interactions (PPIs) obtained by mining publicly available databases: BioGRID [38], IntAct [39] and DIP [40] both for mouse and human. The total network consists of 1016 proteins and 8105 PPIs. We applied weights to each interaction based on abundance values for specific brain regions as follows: mean (ExpA, ExpB), so that for each of the regions the specific weight for each of the interactions could be determined. Having varying abundances for interacting proteins in different brain regions, we estimated the region-specific edge that resulted in region-specific PPI. Each brain region network was clustered making use of the spectral properties of the network; the network being expressed in terms of its eigenvectors and eigenvalues, and partitioned recursively (using a fine-tuning step) into communities based on maximizing the clustering measure modularity [41][42][43]; the modularity of the networks was found to be 0.28-0.42. Modularity (Q) measures the quality of a network division into communities from the number of edges found relative to the number expected if placed at random. The modularity value lies in the range 0, which indicates clustering no better than random, to 1, with typical values for real networks ranging from 0.3 to 0.7 [42].
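The edge weighting and the modularity measure described above can be sketched as follows. This is a minimal Python illustration that only evaluates Q for a partition that is already given; it does not reproduce the spectral clustering with fine-tuning used in the study [41][42][43]. The function names, protein labels (P1-P6) and abundance values are hypothetical, and self-interactions are assumed to be absent.

```python
def edge_weights(interactions, abundance):
    """Weight each interaction by the mean abundance of its two proteins."""
    return {
        (a, b): (abundance[a] + abundance[b]) / 2.0
        for a, b in interactions
    }

def modularity(weights, community):
    """Newman's modularity Q of a given partition of a weighted, undirected network."""
    total = sum(weights.values())  # total edge weight
    internal = {}  # weight of edges lying inside each community
    strength = {}  # summed weighted degree of each community
    for (a, b), w in weights.items():
        ca, cb = community[a], community[b]
        strength[ca] = strength.get(ca, 0.0) + w
        strength[cb] = strength.get(cb, 0.0) + w
        if ca == cb:
            internal[ca] = internal.get(ca, 0.0) + w
    return sum(
        internal.get(c, 0.0) / total - (s / (2.0 * total)) ** 2
        for c, s in strength.items()
    )

# Hypothetical toy network: two triangles joined by a single interaction.
interactions = [("P1", "P2"), ("P2", "P3"), ("P1", "P3"),
                ("P4", "P5"), ("P5", "P6"), ("P4", "P6"),
                ("P3", "P4")]
abundance = {p: 1.0 for p in ("P1", "P2", "P3", "P4", "P5", "P6")}
community = {"P1": 0, "P2": 0, "P3": 0, "P4": 1, "P5": 1, "P6": 1}
weights = edge_weights(interactions, abundance)
q = modularity(weights, community)
```

Supplying region-specific abundance values to `edge_weights` yields a differently weighted network for each region from the same set of interactions, which is the idea behind the region-specific networks described above.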
Enrichment for biological process and cellular component was performed using the topGO package (https://bioconductor.org/packages/release/bioc/html/topGO.html), while functional enrichment of synaptic proteins/gene groups that are known risk factors for schizophrenia was performed using the published schizophrenia risk factor dataset [44].
The stability of all interactions across the regions was assessed by comparing the clustering results for each of the seven region-specific networks and assigning each interaction a score of 0 if both proteins appeared in the same cluster and a score of 1 if they appeared in different clusters. Scores were summed over all seven regions, resulting in sums ranging from 0 (proteins remain in the same cluster in all regions) to 7 (proteins never appear in the same cluster). For the "stable" network, we selected interactions with scores ≤2, which means that they persist in the same cluster in at least 5/7 (71%) of the regional networks.
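The scoring scheme above can be written out in a few lines. This is an illustrative Python sketch under stated assumptions, not the scripts used in the study: the function names, protein labels and cluster assignments are hypothetical, and each regional clustering is assumed to be a mapping from protein to cluster label.

```python
def instability_score(protein_a, protein_b, clusterings):
    """Count the regions in which the two proteins fall in different clusters."""
    return sum(
        0 if clusters[protein_a] == clusters[protein_b] else 1
        for clusters in clusterings
    )

def stable_interactions(interactions, clusterings, max_score=2):
    """Keep interactions whose summed score does not exceed max_score."""
    return [
        (a, b) for a, b in interactions
        if instability_score(a, b, clusterings) <= max_score
    ]

# Hypothetical cluster labels for three proteins in seven regional networks.
clusterings = [
    {"P1": 0, "P2": 0 if i < 5 else 1, "P3": 0 if i < 4 else 1}
    for i in range(7)
]
interactions = [("P1", "P2"), ("P1", "P3"), ("P2", "P3")]
stable = stable_interactions(interactions, clusterings)
```

In this toy example P1 and P2 share a cluster in five of the seven regions (score 2) and are retained, whereas P1 and P3 share a cluster in only four regions (score 3) and are excluded.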
For disease enrichment analysis, the community and protein robustness values within the range 0-1 were taken as edge weights. Each region network was then clustered and cluster enrichment was assessed using the topOnto package (https://github.com/hxin/topOnto) and OMIM/GeneRIF/Ensembl variation annotation data. For disease enrichment, the annotation data were standardized using MetaMap [45][46][47] and NCBO Annotator (https://www.bioontology.org/annotator-service) to recognize terms found in the Human Disease Ontology (HDO) [48]. Recognized enriched disease ontology terms were then associated with gene identifiers and stored locally. Disease term enrichment could then be calculated using the topology-based elimination Fisher method [49] found in the topGO package (http://topgo.bioinf-mpi-inf.mpg.de/), together with the standardized OMIM/GeneRIF/Ensembl variation gene-disease annotation data (17,731 gene-disease associations), and the full HDO tree (3140 terms). Each region was then examined individually by performing the clustering analysis and enrichment for each of the clusters identified in each of the seven brain regions.

Quantification of Postsynaptic Proteins from Brain Regions
Seven integral brain regions within the forebrain (prosencephalon) and hindbrain (rhombencephalon) were dissected from six eight-week-old C57BL/6J mice (Figure 1A and Figure S1). Forebrain regions included telencephalic structures: frontal cortex (CxF), medial cortex (CxM), caudal cortex (CxCA), hippocampus (Hip), and striatum (ST); the hypothalamus (Hyp) represented a diencephalic structure; and the hindbrain was represented by cerebellum (CB). These represent major brain regions with different structural and functional attributes that can be dissected from the brain relatively easily.
PSD fractions were prepared from six mice and all 42 samples were analyzed using LC-MS/MS. Label-free quantitation of peptide intensity identified 1173 proteins across all seven brain regions (Table S1). We found a significant overlap between our dataset and those obtained in other mouse studies [4][5][6][10][11][12][22] (Figure S2A,B); the 61 proteins unique to this study are summarized in Table S2.
To determine the validity of pooling data from six mice, we performed several analyses. We first used the differential stability (DS) approach, which has been previously applied to transcriptomic and proteomic analyses of adult human brain regions [25,50] (Table S3). For this, we estimated the average pairwise Pearson correlation to identify the proteins that demonstrate similar patterns across all six brains. Of the 1173 synaptic proteins, roughly half (572) displayed a high DS correlation, or similar expression patterns across all brain regions (Table S3A,B). We found that these were functionally enriched in synaptic transmission proteins (q = 8.25 × 10⁻¹⁹), ATP metabolic processes (q = 1.33 × 10⁻¹²) and calcium ion transporting proteins (q = 2.02 × 10⁻⁹). Proteins involved in pathways associated with learning (q = 8.08 × 10⁻⁶), memory (q = 6.23 × 10⁻³) and behavior (q = 1.39 × 10⁻⁷) were also over-represented in this high DS subset. Components of several KEGG pathways were also highly correlated between individuals, including long-term potentiation (q = 7.95 × 10⁻⁷), calcium signaling pathways (q = 1.13 × 10⁻⁵), Huntington's (q = 2.66 × 10⁻⁷), Alzheimer's (q = 2.14 × 10⁻⁵) and Parkinson's disease (q = 8.95 × 10⁻⁶) (Table S3C). The most highly conserved proteins among the six individual mice, with the highest DS values, were STX1A (ρ = 0.96), STUM (ρ = 0.96), CDH13 (ρ = 0.95) and ATP1B2 (ρ = 0.95) (Table S3B). We also compared the distribution of synaptic protein abundances across individual mice by principal component analysis (PCA). This analysis (Figure S3A) indicates that the synaptic proteomes of the mice largely overlap, with brains A-C corresponding to the central region of the distribution. As Tukey's HSD test shows no significant difference in the mean values of six individuals at a confidence level of 95% (Figure S3B), we determined that the data from all six individuals could be combined and the mean protein abundances were then used for all downstream analyses.
Proteomes 2018, 6, x 6 of 18

Regional Differences in Postsynaptic Proteome Composition
To identify postsynaptic proteins with differential expression between brain regions, proteins having a mean peptide intensity of 1.5-fold or greater in one brain region compared with any other and determined to be significant with p < 0.05 were identified (Table S4). Eight hundred sixty-eight (74%) proteins were found to be differentially expressed in at least one region compared with all others (Table S4). The regions with the largest number of differentially expressed proteins were the cerebellum (251), hypothalamus (243) and striatum (161). By contrast, the frontal (14), medial (70) and caudal cortex (34) were found to contain the lowest number of differentially expressed proteins compared with all other regions (Figure 1C).

Hierarchical clustering of all proteins revealed that each region has a unique signature of expression. Moreover, these signatures are organized in line with the classical anatomical architecture of the brain: the three cortical regions showed greatest similarity, and the next most similar region was the hippocampus, then striatum, hypothalamus and cerebellum (Figure 1B). This clustering reflects the embryological divisions of the vertebrate brain into telencephalon, diencephalon and rhombencephalon (Figure 1C). Moreover, these results complement findings in the human neocortex, where unique signatures were also found for each region. Together, these findings indicate that compositional differences in the postsynaptic proteome reflect, at least in part, the embryological patterning mechanisms that define brain regions.
This clustering approach also allowed us to examine region-specific functions. We identified six sets of proteins, which we call postsynaptic proteome modules (PPM 1-6) (Figure 1B and Table S5). As indicated by the clustering and Circos plots (Figure S4), these PPMs were differentially distributed in brain regions. To understand the functional significance of differential protein expression in modules and regions, we analyzed the KEGG biochemical and disease pathways in PPMs (Figure 2A). The PPMs showed differential composition of pathways. For example, neurodegenerative diseases were found in PPM1, whereas synaptic plasticity (long-term potentiation and long-term depression) and relevant signaling pathways were in PPM2.
Examination of KEGG pathway enrichment in brain regions (Figure 2B) revealed three major groups. It is striking that very similar groupings were observed in the analysis of human neocortical regions. Group 1 contained terms including MAPK, chemokine, neurotrophin pathways; Group 2 included synaptic plasticity mechanisms and calcium signaling; and Group 3 included neurodegenerative diseases (Alzheimer's, Huntington's, and Parkinson's) and metabolic mechanisms (glycolysis/gluconeogenesis and oxidative phosphorylation). These findings suggest that biochemical pathways in the postsynaptic proteome are differentially distributed across brain regions and that the mechanisms controlling this distribution are species conserved.

Distribution of Mechanism of Cognition and Protein Complexes
The seven regions of the brain examined in this study are thought to play distinct but interdependent roles in cognitive function. Therefore, we examined the distribution of 33 selected proteins that are known to play roles in cognition (Figure 3A). Hierarchical clustering shows that the three cortical regions examined (CxF, CxM, and CxCA) cluster together by similarity, while the Hip region clusters separately from all others. The Hyp and ST regions cluster together by similarity in their abundances of proteins involved in memory and cognition, while the CB clusters separately from all of the other six regions. The abundances of these proteins clustered into two main branches (Figure 3A).
To assess the heterogeneity of synaptic protein complexes throughout the brain, the abundances of the four MAGUK scaffold protein paralogs Dlg1 (also known as Sap97), PSD93 (Dlg2), Dlg3 (also known as Sap102) and PSD95 (Dlg4) were mapped across the various brain regions. We found that these four molecules, which play fundamental roles in synaptic transmission, were differentially distributed throughout the brain, with Dlg1 being most abundant in the synapses of the CxM and PSD93 most abundant in Hip, CxCA and CxF. By contrast, both Dlg3 and PSD95 showed similar protein abundance profiles across the various brain regions (Figure 3B).


Correlations of Regional Synapse Proteomes with the Connectome
There are large-scale efforts to map the mouse brain connectome by identifying the projections of neurons between brain regions [37,51]. Because these connections are made at synapses, there may be a relationship between the molecular composition of synapses in a region and that region's interconnections. To address this, we asked if the synaptic proteins quantified in this study correlated with connectivity data from the Allen Brain Institute's Mouse Brain Connectivity Atlas (mesoscale connectome) [37] (Figure 4 and Table S6). Hierarchical clustering of postsynaptic proteome abundance and connection strength, approximated from projection volume, shows that regional connections are associated with distinct signatures of proteins. Moreover, two major branches separated cortex, striatum and hippocampus from cerebellum and hypothalamus, suggesting that hindbrain and basal forebrain connections have broadly distinct molecular properties compared with connections of other forebrain structures. We found that, in the hippocampus, Dlg3, PSD93 and PSD95 but not Dlg1 were highly correlated (R² = 0.7-0.8) with projection volume (Table S6). We then asked if the biochemical pathways that underlie brain connectivity were brain region specific, and performed functional enrichment on the synaptic proteins that were highly correlated (R² ≥ 0.6) with neuron projection volume for each region. Pathways associated with glutamatergic synapses, calcium signaling, long-term potentiation (LTP), long-term depression (LTD) and insulin signaling were over-represented in the hippocampus, striatum and cortical regions, whereas pathways involved in Parkinson's disease, Huntington's disease and oxidative phosphorylation, as well as mitochondrial components, were enriched in the cerebellum and hypothalamus. Additionally, the molecular correlates of connectivity in the hippocampus are uniquely enriched in endocytosis (q = 4.09 × 10⁻⁹) and GABAergic synapses (q = 7.53 × 10⁻³), the cortical regions in components of the TCA cycle (q = 1.72 × 10⁻³) and the cerebellum in valine, leucine and isoleucine degradation pathways (q = 1.11 × 10⁻⁶) and fatty acid metabolism (q = 4.6 × 10⁻⁴) (Table S7A-E). Together, these findings indicate that synapse proteome composition may reflect functional differences between interconnected brain regions.
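The selection of proteins whose abundance correlates with projection volume (R² ≥ 0.6) can be sketched as follows. This is a minimal illustration that assumes a Pearson correlation between each protein's abundance and projection volume across a set of target regions. The exact pairing of abundance and connectivity values used in the study is not described in this section, and the protein names and numbers below are hypothetical.

```python
from math import sqrt


def pearson_r(xs, ys):
    """Pearson correlation coefficient; returns 0.0 if either input is constant."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    sxy = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    sxx = sum((x - mean_x) ** 2 for x in xs)
    syy = sum((y - mean_y) ** 2 for y in ys)
    if sxx == 0 or syy == 0:
        return 0.0
    return sxy / sqrt(sxx * syy)


# Hypothetical projection volumes from one source region to four targets.
projection_volume = [0.1, 0.4, 0.5, 0.9]

# Hypothetical protein abundances measured in the same four targets.
abundance = {
    "ProteinA": [1.0, 4.1, 4.9, 9.2],  # rises with projection volume
    "ProteinB": [5.0, 5.1, 4.9, 5.0],  # essentially flat
    "ProteinC": [9.0, 6.2, 5.1, 1.0],  # falls with projection volume
}

R2_THRESHOLD = 0.6  # cut-off quoted in the text

r_squared = {
    protein: pearson_r(values, projection_volume) ** 2
    for protein, values in abundance.items()
}
correlated = sorted(p for p, r2 in r_squared.items() if r2 >= R2_THRESHOLD)
print(correlated)
```

Because the filter uses R², strongly negative correlations pass the threshold as well as strongly positive ones. The proteins retained would then be the input to a functional enrichment analysis.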

Figure 4. Correlation between brain region-specific postsynaptic protein abundance and mesoscale connectome. Clustering heatmap of the correlation between protein abundance and neuron projection volume in each brain region. Color key shows Z-transformed correlation values; red corresponds to negative correlation and blue to positive correlation.

Regional Differences in Postsynaptic Protein Interaction Networks
The organization and function of synapse proteomes have been studied using protein-protein interaction (PPI) networks [18,22,52,53] and we used this approach to explore the organization of protein interaction networks in different brain regions. First, the total postsynaptic proteome network was built from the list of proteins obtained in this study and PPIs obtained by mining publicly available databases: BioGRID [38], IntAct [39] and DIP [40] for both mouse and human. The total network consists of 1016 proteins and 8105 PPIs. Using the differential abundance of proteins as an edge weight, we constructed individual networks for each region, identified clusters and their corresponding enrichments in biological and disease functions (Table S8). We found that each region-specific PPI network was split by the same method into a different number of clusters (cl. N), ranging from 82 to 112 clusters (results for a spectral clustering algorithm are shown in Table S8). We assessed the clustering structure of each region's PPIs for robustness and the resulting (consensus) clusters were examined for disease enrichment (Table S9).
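The idea of deriving one weighted network per region from a single shared interaction list can be sketched as below. The text states only that differential abundance was used as an edge weight, so the weighting rule (the mean relative abundance of the two interacting proteins) is an assumption made for illustration, as are the protein names, interactions and abundance values.

```python
# Shared interaction list; protein names and interactions are hypothetical.
ppis = [("P1", "P2"), ("P2", "P3"), ("P3", "P4")]

# Hypothetical relative abundance of each protein per region (0 to 1 scale).
relative_abundance = {
    "Hip": {"P1": 0.9, "P2": 0.7, "P3": 0.2, "P4": 0.1},
    "CB": {"P1": 0.1, "P2": 0.3, "P3": 0.8, "P4": 0.6},
}


def weighted_network(interactions, abundance):
    """Weight each interaction by the mean abundance of its two endpoints.

    This rule is an assumption for illustration, not the method of the study.
    """
    return {(a, b): (abundance[a] + abundance[b]) / 2 for a, b in interactions}


# Same topology in every region, but region-specific edge weights.
regional_networks = {
    region: weighted_network(ppis, abundance)
    for region, abundance in relative_abundance.items()
}

for region, network in regional_networks.items():
    print(region, network)
```

Since the topology is identical across regions, any difference in the clusters found by a weight-aware clustering algorithm arises from the abundance-derived weights alone.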

Identifying a Stable Core Network
To define the network structures that are conserved across all brain regions, we identified 4205 binary interactions (52% of total) that were found in the same cluster in the majority of regional networks. We refer to this as the "stable" postsynaptic density (PSD) (1016 proteins, Figure 5 and Table S10). Spectral clustering generated 73 clusters in total, where the nine largest represent crucial synaptic proteins and neural housekeeping functions: cl. 6 corresponds to the postsynaptic signaling complexes composed of MAGUK scaffold proteins, AMPA and NMDA receptors [23], and other clusters contain ribosomal proteins, metabolic enzymes and actin/myosin-associated proteins (Figure 5, Table S10). We compared the composition of these stable communities with previously detected postsynaptic proteome modules (PPMs) and found significant overlaps. For example, cl. 1 containing ATPase and cytochrome-related proteins and cl. 8 containing mitochondrial complex I proteins were over-represented in protein module PPM1 associated with related terms (p = 4.5 × 10⁻⁴ and p = 9.3 × 10⁻⁵, respectively); cl. 4 is composed mainly of metabolic enzymes and falls almost entirely into PPM1, which is associated with Parkinson's disease, Alzheimer's disease and metabolic pathways by KEGG pathway enrichment (Figure 5; p = 7.5 × 10⁻⁵). Molecules involved in memory and cognition (e.g., MAGUKs) that populated cl. 6 are distributed between PPM5 (p = 8.0 × 10⁻³) and PPM4 (p = 1.9 × 10⁻²), which are associated with the terms "neurotransmitter receptor binding and downstream transmission in the postsynaptic cell", "long-term potentiation" and "long-term depression".
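The selection rule for the stable network, interactions whose two proteins fall in the same cluster in the majority of regional networks, can be sketched as follows. The cluster assignments and protein names are hypothetical, and the function name `stable_interactions` is introduced here for illustration only. With seven regions the majority threshold is four, in line with the description of Table S10.

```python
from collections import Counter


def stable_interactions(interactions, cluster_of_by_region):
    """Keep interactions whose endpoints share a cluster in most regions."""
    majority = len(cluster_of_by_region) // 2 + 1
    counts = Counter()
    for cluster_of in cluster_of_by_region.values():
        for a, b in interactions:
            # Cluster labels are compared within one region only, so they
            # do not need to match between regions.
            if cluster_of[a] == cluster_of[b]:
                counts[(a, b)] += 1
    return [pair for pair in interactions if counts[pair] >= majority]


# Hypothetical interactions and per-region cluster assignments.
ppis = [("P1", "P2"), ("P2", "P3"), ("P3", "P4")]
cluster_of_by_region = {
    "Hip": {"P1": 0, "P2": 0, "P3": 1, "P4": 1},
    "CxF": {"P1": 0, "P2": 0, "P3": 0, "P4": 1},
    "CB": {"P1": 2, "P2": 2, "P3": 3, "P4": 3},
}

stable = stable_interactions(ppis, cluster_of_by_region)
print(stable)

# Fraction of interactions reported as stable in the text (4205 of 8105).
stable_fraction = 4205 / 8105
print(f"{stable_fraction:.0%}")
```

Counting co-membership rather than comparing cluster labels avoids having to align cluster numbering between regional networks, which would otherwise be arbitrary.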

Figure 5. Postsynaptic proteome interaction network showing the cluster structure for the "stable network". A few large clusters with specific functionally related proteins could be detected: cl. 3 contains ribosomal proteins (red cluster at the bottom), cl. 4 contains metabolic enzymes (light green on the left), cl. 7 is enriched with actin-, myosin- and cytoskeleton remodeling-associated proteins (dark blue near the top), cl. 8 contains NADH-oxidoreductases (dark-red cluster on the right) and cl. 1 contains ATPases and voltage-dependent anion channels (light blue on the right). cl. 6 corresponds to key proteins involved in synaptic transmission and plasticity, including AMPA, NMDA receptors and MAGUK proteins (orange cluster at the top). Networks were visualized with Gephi.

Discussion
Using mass spectrometry, we have examined the protein composition of the postsynaptic proteome of excitatory synapses from regions of the mouse brain and generated a freely available data resource (Edinburgh DataShare, http://dx.doi.org/10.7488/ds/2399). We found that a high percentage of proteins show differences in abundance between brain regions. The postsynaptic proteome composition for each brain region forms a distinctive molecular signature. Because these proteomic data were obtained from tissue samples composed of many individual synapses, the proteomic signatures indicate that there might be synapse diversity at the single-synapse level. Consistent with this, we have recently examined the differential distribution of PSD95 and SAP102 in individual synapses across the whole mouse brain and found that these two proteins are differentially distributed into synapse subtypes [26]. Moreover, each brain region was composed of varying proportions of synapse subtypes, which results in each region having a "signature of subtypes". Together, these findings indicate that postsynaptic proteome diversity seen at the level of brain regions arises at the individual synapse level.
The advantage of proteomic mass spectrometry is that it examines the expression of large numbers of proteins and can therefore shed light on how sets of proteins are expressed. We found regional diversity in sets of proteins known to be associated with biochemical pathways controlling physiological processes (such as forms of synaptic plasticity and cognition) and diseases. We also found evidence that scaffold proteins involved with the supramolecular assembly of complexes and supercomplexes [20,23] were differentially distributed across the brain.
The analysis of weighted PPI networks supports previous findings that the postsynaptic proteome has a modular structure [52]. We now find that the regional variability of protein complex composition strongly depends on the relative protein abundance, thereby providing the heterogeneity and unique biochemical signaling potential of each region. In each brain region, we find that ~60% of the complexes/clusters are conserved in a stable network and that ~40% underpin regional specifications. Disease enrichment analysis, which was performed at the regional level, tends to identify the same clusters. Regional specificity results in effectively the same cluster being more or less enriched for each disease across the different brain regions.
The coordinate expression of sets and modules of proteins suggests the possibility that there is an underlying genetic mechanism coordinating the spatial expression of synapse proteins. Evidence in support of a coordinating genetic mechanism acting in the temporal domain was provided from transcriptome analyses of developing cultured neurons [54] and lifespan expression data in the brain of mouse and human [2], which show concerted regulation of postsynaptic proteins. Further evidence for an underlying genetic program regulating the differential spatial expression comes from the hierarchical clustering of protein expression that reveals a correlation with the early development of the nervous system. A similar result was obtained using the single-synapse resolution mapping: the regional signatures of synapse composition in the adult mouse brain are organized into three major groups corresponding to the earliest division of the neural tube [55]. The clustering of the hippocampus, a subcortical region, with cortical regions likely reflects their common developmental origins [55]. These multiple lines of evidence suggest that there is temporal and spatial regulation of postsynaptic proteome expression and that it produces diversity of synapse types [26].
The purpose of regional diversity is most likely to subserve region-specific physiological and behavioral functions. It might also be to provide regional specializations within the greater systems-level organization or global circuitry of the brain. Our analysis of connectome data indicates that the anatomical circuitry of the brain links areas with distinct synapse proteome compositions. Given the differences in biochemical pathways, this indicates that functional specializations in regions are integrated by the connectome. This is consistent with our recent findings showing that the regional composition of the neocortex is linked to behavioral functions observed using functional magnetic resonance imaging (fMRI) [25].
It is interesting that hypothalamus and striatal regions cluster together in protein abundance, whereas the cortex, striatum and hippocampus were separate from the cerebellum and hypothalamus when abundance and connection strength were analyzed together. The protein abundance and connection strength correlation between the hypothalamus and cerebellum might relate to the involvement of these structures in several shared biological functions. For example, the cerebellum is known to be interconnected with the hypothalamus, an important center for feeding control, through direct bidirectional connections [56][57][58]. Most of the hypothalamic neurons receiving cerebellar projections are feeding related. The reciprocal connections between the cerebellum and hypothalamus might therefore play an important role in feeding motivation and in the regulation of feeding behavior [59].
Numerous forms of dementia show pathology in different regions of the brain. For example, dementia with Lewy bodies, which is the second most common form of neurodegenerative dementia after Alzheimer's disease [60], shows hypothalamic atrophy, potentially relevant to the enrichment of dementia pathways observed in this region, whereas this region is not affected in Alzheimer's disease patients [61]. Although the severe pathology of Huntington's disease is mostly related to the striatum, important changes in the hypothalamus and cerebellum have also been described [62], consistent with the observed enrichment of their respective disease pathways in these areas. Homeostatic control of emotions and of metabolism are disturbed early in Huntington's disease and there are alterations in the peptide expression of hypothalamic neurons known to be involved in the regulation of metabolism and emotion [63]. In cerebellum, a dysfunction of the Purkinje cells might contribute to motor impairment in a murine model of Huntington's disease [64], and the cerebellum appears to be commonly affected in juvenile Huntington's disease, as shown by a decrease of cerebellar volume [65][66][67]. The enrichment of Parkinson's disease pathways in hypothalamus is interesting since alterations in hypothalamus have been linked to non-motor symptoms such as sleep disturbances [68]. Furthermore, the cerebellum is known to be involved in Parkinsonian disorders with motor symptoms [69].
In conclusion, we provide a new data resource describing the composition of the postsynaptic proteome in excitatory synapses from regions of the mouse brain. These data indicate that molecular compositional differences in synapses in different brain regions are relevant to a broad range of physiological and disease processes.
Supplementary Materials: The following are available online at http://www.mdpi.com/2227-7382/6/3/31/s1, Figure S1. The seven regions dissected from the mouse brain. Figure S2. The overlap between proteomes obtained in this study and previous studies. Figure S3. The principal component analysis (PCA) of the abundances of the 1173 synaptic proteins quantified across the seven integral regions for all six mouse brains. Figure S4. Circos plots showing differential abundance of the six postsynaptic proteome modules (PPMs) across mouse brain regions. Table S1. LC-MS/MS quantitation and comparison of the mouse synaptic proteome across the seven integral regions of the mouse brain by one-way ANOVA. Table S2. New PSD proteins detected in this study compared to all other published mouse PSD studies. Table S3. Differential stability analysis and functional enrichment of synaptic proteins. Table S4. Summary of the 868 synaptic proteins found to be differentially abundant across the seven integral regions of the mouse brain. Table S5. Module gene list and abundances for all six modules. Table S6. Correlation of ABI functional connectivity with synaptic proteome abundances measured in this study. Table S7. Functional Enrichment of ABI and Roy Correlates by Brain region. Table S8. PSD protein interaction network analysis across the mouse brain. Table S9. Disease associated enrichment in the interactomes of the mouse PSD. Table S10. The "stable network": postsynaptic proteome protein-protein interaction networks found to be enriched in at least four different regions of the mouse brain.