Spatial-proteomics reveals phospho-signaling dynamics at subcellular resolution

Dynamic change in subcellular localization of signaling proteins is a general concept that eukaryotic cells evolved for eliciting a coordinated response to stimuli. Mass spectrometry-based proteomics in combination with subcellular fractionation can provide comprehensive maps of spatio-temporal regulation of protein networks in cells, but involves laborious workflows that do not cover the phospho-proteome level. Here we present a high-throughput workflow based on sequential cell fractionation to profile the global proteome and phospho-proteome dynamics across six distinct subcellular fractions. We benchmark the workflow by studying spatio-temporal EGFR phospho-signaling dynamics in vitro in HeLa cells and in vivo in mouse tissues. Finally, we investigate spatio-temporal stress signaling, revealing cellular relocation of ribosomal proteins in response to hypertonicity and muscle contraction. Proteomics data generated in this study can be explored through https://SpatialProteoDynamics.github.io.

(E) F-score barplots for the protein assignment to organelles in the present study (blue) and in previously published subcellular fractionation studies (yellow and red).
(F) F-score barplots for the phosphosite assignment to organelles in the present study (green, HeLa; red, liver) and in Krahmer et al 6 (blue).

Supplementary Note 1: assignment of proteins to dual or multiple locations.
In the main manuscript text, we state that 82% of the proteins were reproducibly identified in two or more of the six subcellular fractions. It has been reported that a significant part of the proteome is not restricted to only one subcellular location 1. However, it is very important to differentiate between identification in a subcellular fraction and actual colocalization of a protein in a subcellular niche. Mere identification cannot provide accurate information on the subcellular niche of the protein. Instead, that information should be derived from the quantification of a protein's abundance across the six fractions, i.e. the relative enrichment of the protein in each fraction. Throughout the manuscript, we use the relative enrichment in each fraction to extrapolate the subcellular niches, assigning the fraction with the highest intensity as the main subcellular location for a protein.
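The distinction between identification and relative enrichment can be illustrated with a minimal Python sketch. The fraction labels `FR1` to `FR6`, the function names and the intensity values are assumptions made for illustration only; they are not taken from the study's data or code.

```python
# Hypothetical labels for the six fractions (assumed naming, FR1..FR6).
FRACTIONS = ["FR1", "FR2", "FR3", "FR4", "FR5", "FR6"]

def relative_enrichment(intensities):
    """Scale a protein's per-fraction intensities so that they sum to 1."""
    total = sum(intensities)
    if total <= 0:
        raise ValueError("protein has no quantified signal in any fraction")
    return [value / total for value in intensities]

def main_location(intensities):
    """Return (fraction label, relative enrichment) of the most intense fraction."""
    profile = relative_enrichment(intensities)
    best = max(range(len(profile)), key=profile.__getitem__)
    return FRACTIONS[best], profile[best]

def fractions_identified(intensities):
    """List the fractions in which the protein was detected at all."""
    return [name for name, value in zip(FRACTIONS, intensities) if value > 0]

# Made-up intensities for one protein (illustration only, not study data).
example = [10.0, 0.0, 5.0, 25.0, 60.0, 0.0]
label, share = main_location(example)
detected = fractions_identified(example)
print(label, round(share, 2), detected)
```

In this toy profile the protein is identified in four fractions, yet most of its signal sits in a single fraction, which is the one reported as its main location.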
When trying to accurately assess dual or multiple locations of proteins, our approach is limited by the fact that it comprises only six fractions, each of which can group several subcellular niches. Methods that provide higher resolution, such as hyperLOPIT 3, LOPIT-DC 4 or SubCellBarCode 5, offer better insights in this regard. In fact, some bioinformatics tools have been developed to assess simultaneous protein subcellular localization in those datasets, such as the one described by Crook et al 7. However, as already mentioned, the predictive outcome of this tool is better suited for subcellular approaches with more fractions analyzed.
Nevertheless, despite the limitations imposed by the purification of only six subcellular compartments, our approach can also identify proteins that are present in multiple compartments simultaneously. In fact, we demonstrated in the main text the dual, and also dynamic, location of the EGFR-adaptor proteins SHC1, GRB2 and CBL, which were all found in both the cytosol and the membrane-associated compartment (Main Figure 4C). To assess more broadly whether proteins with dual or multiple locations are captured by our experimental approach, we investigated some proteins known to have dual localization according to the antibody-based fluorescent image analysis described in the publication by Thul et al 1. As examples of proteins with dual/multiple location, Thul et al described CCAR1 and NDUFA9, which they found in the nucleus and, additionally, in the Golgi apparatus or mitochondria, respectively. For both CCAR1 and NDUFA9, our dataset is in line with the observations by Thul et al, as the majority of each protein is in FR5 (nucleoplasm) and FR4 (mitochondrion), respectively. Moreover, we can see some contribution of CCAR1 in FR4 (enriched in Golgi proteins), which is especially clear in

Data sets
The Datasets sheet in MetaMass II contains normalized data from the indicated studies. Normalization to a fixed maximum value is recommended for better visualization of data in heatmaps. Columns TD:TU can be used to filter the overlap between individual studies.

K-means clustering 1
Results from the two studies were saved as separate tab-delimited text files for processing in Cluster 3.0. Both contain a first column with protein identifiers.

K-means clustering 2
The tab-delimited text files are opened in Cluster 3.0 and the data are formatted as indicated above. With 875 groups for k-means clustering, the groups will contain an average of five proteins. Larger groups will yield higher coverage and lower precision for assigning subcellular locations (see later).
The kgg output files from Cluster 3.0 contain protein identifiers and group assignments. The lists are copied into the «Groups» worksheet in the Analysis Workbook.
The clustering result for a given dataset is pasted into cell A1 in the Data input sheet in MetaMass. Click a button to select a marker set. Click «Copy F-scores» and paste into the corresponding column in the F-score worksheet in the Analysis Workbook (see next page).
The algorithm counts markers for all subcellular locations within all groups. All proteins in the same group are assigned to the location with the highest marker count. Thus, if a group contains three markers for the nucleus and two for the cytosol, all proteins in that group are assigned to the nucleus. The purity is, however, only 0.6, i.e. 3/(3+2), and only three markers support the assigned location. Higher purity and higher marker counts indicate higher precision for the assigned locations.
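The marker-counting rule described above can be sketched in Python as follows. This is not the MetaMass implementation, which relies on Excel functions; the function name, the protein identifiers and the tie-breaking behaviour are assumptions made for illustration.

```python
from collections import Counter, defaultdict

def assign_by_markers(group_of, marker_location):
    """Assign every protein in a group to the group's most frequent marker location.

    group_of: dict protein -> group id (e.g. taken from a clustering result)
    marker_location: dict protein -> single known location (the marker set)
    Returns dict protein -> (location, purity, marker_count).
    """
    members = defaultdict(list)
    for protein, group in group_of.items():
        members[group].append(protein)

    assignments = {}
    for proteins in members.values():
        counts = Counter(marker_location[p] for p in proteins if p in marker_location)
        if not counts:
            continue  # the group holds no markers, so its proteins stay unassigned
        # Ties go to the location counted first (an arbitrary choice in this sketch).
        location, top = counts.most_common(1)[0]
        purity = top / sum(counts.values())
        for p in proteins:
            assignments[p] = (location, purity, top)
    return assignments

# Hypothetical group 1: three nuclear markers, two cytosolic markers, one unannotated protein.
group_of = {name: 1 for name in ["N1", "N2", "N3", "C1", "C2", "X1"]}
group_of["Y1"] = 2  # hypothetical group 2 without any markers
marker_location = {"N1": "nucleus", "N2": "nucleus", "N3": "nucleus",
                   "C1": "cytosol", "C2": "cytosol"}

assignments = assign_by_markers(group_of, marker_location)
print(assignments["X1"])
```

The unannotated protein `X1` inherits the nuclear assignment of its group together with the purity of 3/(3+2) and a marker count of three, matching the worked example in the text. Note that the two cytosolic markers are also assigned to the nucleus, which is what lowers precision in impure groups.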
The classification is based on standard Excel functions. Click yellow cells to inspect them.

Annotations
The Annotations sheet in MetaMass contains annotations on the subcellular location of proteins from the indicated sources. The marker sets were generated by filtering on single locations in full annotations.
MetaMass also classifies the assigned locations on the basis of their fit with annotations in the Compartments Database (https://compartments.jensenlab.org/Search, https://doi.org/10.1093/database/bau012). Annotations in the Compartments Database are not based on mass spectrometry data and therefore serve as an independent reference. The spreadsheet returns the percentage of assigned proteins with a Compartments Database score higher than 4 (max = 5).
Scores from the Compartments Database serve as an independent reference.
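The percentage returned by the spreadsheet can be approximated with the short Python sketch below. It assumes that the score is looked up for the location that was assigned to each protein, and that proteins without a database entry count as unsupported; neither detail is stated in the text, and the identifiers and scores are made up.

```python
def percent_supported(assigned, scores, threshold=4.0):
    """Percentage of assigned proteins whose reference score exceeds the threshold.

    assigned: dict protein -> assigned location
    scores: dict (protein, location) -> reference score on a 0 to 5 scale
    Missing (protein, location) pairs are treated as score 0 (an assumption).
    """
    if not assigned:
        return 0.0
    hits = sum(
        1
        for protein, location in assigned.items()
        if scores.get((protein, location), 0.0) > threshold
    )
    return 100.0 * hits / len(assigned)

# Hypothetical assignments and reference scores (illustration only).
assigned = {"P1": "nucleus", "P2": "nucleus", "P3": "cytosol", "P4": "cytosol"}
scores = {
    ("P1", "nucleus"): 5.0,
    ("P2", "nucleus"): 4.0,  # exactly 4 is not "higher than 4"
    ("P3", "cytosol"): 4.5,
    # P4 has no entry for its assigned location
}

supported = percent_supported(assigned, scores)
print(supported)
```

With a strict "higher than 4" cut-off, only `P1` and `P3` count as supported in this toy example.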