Insulin-stimulated MCF7 breast cancer cells: Proteome dataset

The proteome data provided in this article were acquired from MCF7 breast cancer cells stimulated with insulin, and were generated by using a 2D-SCX (strong cation exchange)/RPLC (reversed phase liquid chromatography) separation protocol followed by tandem mass spectrometry (MS) detection. To facilitate data re-processing by more advanced search engines and the extraction of additional information from already existing files, both raw and processed data are provided. The sample preparation, data acquisition and processing protocols are described in detail. The raw data relate to work published in “Proteome profile of the MCF7 cancer cell line: a mass spectrometric evaluation” (Sarvaiya et al., 2006) [1] and are made available through the PRIDE (PRoteomics IDEntifications)/ProteomeXchange public repository with identifier PRIDE: PXD004051 (“2016 update of the PRIDE database and tools” (Vizcaino et al., 2016) [2]).



Value of the data
The data provided in this manuscript describe the proteome profile of insulin-stimulated MCF7 breast cancer cells.
The MS RAW files can be used to verify the fragmentation pattern of 1+, 2+ and 3+ non-labeled peptide ions in a linear ion trap analyzer, to experimentally confirm computational predictions, and to select precursor-fragment transitions for MRM method development.
The MS RAW files can be re-processed with more advanced search engines, or with a combination of search engines, to enable the identification of additional peptides and proteins and to confirm the expression of certain proteins under insulin stimulation conditions.
The biological processes, functional categories and signaling pathways identified in these cells can serve as a reference for comparison with other cell stimulation conditions, for validating data generated in other laboratories, and for supporting the identification of putative drug targets or biomarkers.

Data
The MCF7 proteome data described in this manuscript include: (a) mass spectrometry RAW files deposited in PRIDE; (b) RAW files processed with the Thermo Electron Discoverer 1.4 software; (c) data processed with the DAVID (Database for Annotation, Visualization and Integrated Discovery [3,4]) software package; and (d) data processed with the Cytoscape visualization tool set [5]. Figs. 1-4 provide the sample preparation protocol, representative base-peak chromatograms for the 16 SCX peptide fractions, the KEGG (Kyoto Encyclopedia of Genes and Genomes [6]) insulin signaling    Table 5 provides the list of interactions associated with the 34 insulin signaling proteins generated with STRING (Search Tool for the Retrieval of Interacting Genes/Proteins [7]).

Cell culture
MCF7 breast cancer cells were cultured in EMEM with FBS (10%) and insulin (10 μg/mL) at 37°C in an incubator with 5% CO2. At 70-80% confluence, the cells were detached by trypsinization, harvested and stored in a freezer at −80°C.

Cell processing
The cells were lysed by rocking with RIPA buffer supplemented with protease and phosphatase inhibitors for 2 h at 4°C. The final composition of the lysis solution was: 1 mL RIPA buffer, 100 μL protease inhibitor cocktail (104 mM AEBSF, 0.08 mM aprotinin, 2 mM leupeptin, 4 mM bestatin, 1.5 mM pepstatin A, 1.4 mM E-64), 100 μL NaF (100 mM), 50 μL Na3VO4 (200 mM) and 8.75 mL of ice-cold water [1]. The cells were centrifuged at 15,000g (15 min, 4°C), and the protein concentration in the supernatant was measured with the Bradford assay. The cell extract (1 mL containing 3 mg of proteins) was reduced with DTT (4.5 mM) in the presence of urea (8 M) for 1 h at 60°C. The protein solution was then diluted 10-fold with NH4HCO3 (50 mM) and subjected to enzymatic digestion with trypsin at a protein:enzyme ratio of 50:1 w/w, overnight at 37°C. The digestion reaction was quenched with 10 μL TFA/mL protein digest solution, and 300 μg of the protein digest was desalted with SPEC-PTC18 solid phase extraction tips. The sample was concentrated to a final concentration of 4 mg/mL with a vacuum centrifuge, and stored at −20°C until further analysis by shotgun 2D-SCX/LC-ESI-MS/MS.
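The quantities implied by the reduction and digestion steps can be checked with a few lines of arithmetic. The sketch below is illustrative only: the function and variable names are hypothetical, and it assumes that the stated DTT and urea concentrations refer to the 1 mL extract before dilution and that volume changes from reagent addition are negligible.

```python
# Illustrative arithmetic for the reduction/digestion step.
# Assumptions (not stated in the protocol): concentrations refer to the
# 1 mL extract before dilution; reagent volumes are neglected.

def trypsin_mass_ug(protein_mg: float, protein_to_enzyme: float) -> float:
    """Trypsin mass (micrograms) for a given protein:enzyme ratio (w/w)."""
    return protein_mg * 1000.0 / protein_to_enzyme


def after_dilution(concentration: float, fold: float) -> float:
    """Concentration after an n-fold dilution (same units as the input)."""
    return concentration / fold


protein_mg = 3.0          # protein in the cell extract
extract_volume_ml = 1.0   # extract volume before dilution
dilution_fold = 10.0      # dilution with NH4HCO3 (50 mM)

trypsin_ug = trypsin_mass_ug(protein_mg, 50.0)          # 50:1 w/w
digest_volume_ml = extract_volume_ml * dilution_fold
urea_molar = after_dilution(8.0, dilution_fold)         # urea, M
dtt_mm = after_dilution(4.5, dilution_fold)             # DTT, mM
protein_mg_per_ml = protein_mg / digest_volume_ml

print(f"trypsin needed: {trypsin_ug:.0f} ug")
print(f"digest volume: {digest_volume_ml:.0f} mL")
print(f"urea during digestion: {urea_molar:.1f} M")
print(f"DTT during digestion: {dtt_mm:.2f} mM")
print(f"protein during digestion: {protein_mg_per_ml:.1f} mg/mL")
```

Under these assumptions, the 10-fold dilution brings urea below 1 M before trypsin is added, which is the usual reason for diluting a urea-containing extract prior to digestion.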

Mass spectrometry
An LTQ linear ion trap mass spectrometer (Thermo Electron) was used for detection. Data acquisition occurred in data-dependent mode using 1 MS scan (5 microscans averaged) followed by 1 zoom scan (5 microscans averaged) and 1 MS2 scan on the top 5 most intense peaks. The zoom scan window was 75 m/z. Dynamic exclusion parameters were set at repeat count 1, repeat duration 30 s, exclusion list size 200, exclusion duration 60 s and exclusion mass width ±1.5 m/z. Precursor ion fragmentation occurred by setting the collision-induced dissociation (CID) parameters at an isolation width of 3 m/z, normalized collision energy 35%, activation Q 0.25 and activation time 30 ms.
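The interplay between top-5 precursor selection and dynamic exclusion can be illustrated with a simplified model. The sketch below is not the instrument's control logic: the class and function names are hypothetical, the exclusion mass width is taken as ±1.5 m/z, and a precursor is excluded as soon as it has been selected once (repeat count 1), so the repeat duration is not modeled.

```python
from dataclasses import dataclass, field


@dataclass
class DynamicExclusion:
    """Simplified exclusion list (hypothetical model, not vendor logic)."""
    duration_s: float = 60.0   # exclusion duration
    mass_width: float = 1.5    # assumed +/- window in m/z
    max_size: int = 200        # exclusion list size
    entries: list = field(default_factory=list)  # (mz, expiry time in s)

    def _purge(self, now: float) -> None:
        # Drop entries whose exclusion period has ended.
        self.entries = [(mz, t) for mz, t in self.entries if t > now]

    def is_excluded(self, mz: float, now: float) -> bool:
        self._purge(now)
        return any(abs(mz - e) <= self.mass_width for e, _ in self.entries)

    def add(self, mz: float, now: float) -> None:
        self._purge(now)
        if len(self.entries) >= self.max_size:
            self.entries.pop(0)  # list full: discard the oldest entry
        self.entries.append((mz, now + self.duration_s))


def select_precursors(peaks, now, exclusion, top_n=5):
    """Pick up to top_n most intense, non-excluded peaks from one MS scan.

    peaks is a list of (mz, intensity) tuples; now is the scan time in s.
    """
    chosen = []
    for mz, _intensity in sorted(peaks, key=lambda p: p[1], reverse=True):
        if len(chosen) == top_n:
            break
        if exclusion.is_excluded(mz, now):
            continue
        exclusion.add(mz, now)  # repeat count 1: exclude after one selection
        chosen.append(mz)
    return chosen


# Made-up peak list used only to exercise the model.
scan = [(500.0, 100), (500.5, 90), (600.0, 80), (700.0, 70),
        (800.0, 60), (900.0, 50), (1000.0, 40)]
excl = DynamicExclusion()
first = select_precursors(scan, now=0.0, exclusion=excl)
second = select_precursors(scan, now=10.0, exclusion=excl)
third = select_precursors(scan, now=61.0, exclusion=excl)
print(first, second, third)
```

In this model, a lower-intensity ion (here m/z 1000) is fragmented only after the more intense ions have been placed on the exclusion list, which is how dynamic exclusion extends sampling depth in a shotgun experiment.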

Data processing
Raw data were analyzed with the Discoverer 1.4 software package (Thermo Electron) by using a Homo sapiens database with 20,199 entries downloaded from UniProt (January 2015). The database search parameters were as follows: no chemical or posttranslational modifications were allowed; the minimum and maximum peptide lengths were 6 and 144 amino acids, respectively; only fully tryptic fragments were considered for peptide matching; up to 2 missed cleavage sites were allowed; the precursor ion tolerance was 2 amu; the fragment ion tolerance was 1 amu; and the relaxed and strict false discovery rates (FDRs) were set at 3% and 1%, respectively. The quality of the data at the peptide level is verifiable from multiple tandem MS hits/peptide. The reliability of protein identifications can be inferred from the number of unique peptide hits/protein and from the FDRs set according to the user's preference and choice of search engine. The list of identified proteins was uploaded to DAVID to identify the KEGG signaling pathways and to generate the GO and functional annotation charts. All results were filtered with an EASE score threshold of 0.1 [8]. The 34 proteins matched to the KEGG insulin signaling pathway were uploaded to STRING to extract the known protein-protein interactions related to this set of proteins. This list of interactions was uploaded to Cytoscape 3.4.0 to visualize the network of interactions in a degree-sorted circle layout.
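The peptide search space defined by these parameters (fully tryptic peptides, up to 2 missed cleavages, 6 to 144 residues) can be illustrated with a short in silico digestion. This is a minimal sketch and not the algorithm used by the search engine: the function name is hypothetical, the example sequence is made up, and the rule of not cleaving before proline is a common convention assumed here rather than taken from the protocol.

```python
def tryptic_peptides(sequence, missed_cleavages=2, min_len=6, max_len=144):
    """Return fully tryptic peptides of a protein sequence, sorted.

    Cleavage is assumed to occur C-terminal to K or R, except before P.
    """
    # Cleavage positions, including the protein N- and C-termini.
    sites = [0]
    for i, residue in enumerate(sequence[:-1]):
        if residue in "KR" and sequence[i + 1] != "P":
            sites.append(i + 1)
    sites.append(len(sequence))

    peptides = set()
    last = len(sites) - 1
    for i in range(last):
        # Join up to (missed_cleavages + 1) consecutive fragments.
        for j in range(i + 1, min(i + 1 + missed_cleavages, last) + 1):
            peptide = sequence[sites[i]:sites[j]]
            if min_len <= len(peptide) <= max_len:
                peptides.add(peptide)
    return sorted(peptides)


# Made-up sequence: the K followed by P is not treated as a cleavage site.
example = "AAAAAAKGGGGGGRLLLLLLKPDDDDDD"
for peptide in tryptic_peptides(example):
    print(peptide)
```

Setting `missed_cleavages=0` restricts the output to the fragments between consecutive cleavage sites, and the length filter removes fragments too short to be matched reliably.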