Dataset for the proteomic inventory and quantitative analysis of the breast cancer hypoxic secretome associated with osteotropism

The cancer secretome includes all of the macromolecules secreted by cells into their microenvironment. Cancer cell secretomes are significantly different to that of normal cells reflecting the changes that normal cells have undergone during their transition to malignancy. More importantly, cancer secretomes are known to be active mediators of both local and distant host cells and play an important role in the progression and dissemination of cancer. Here we have quantitatively profiled both the composition of breast cancer secretomes associated with osteotropism, and their modulation under normoxic and hypoxic conditions. We detect and quantify 162 secretome proteins across all conditions which show differential hypoxic induction and association with osteotropism. Mass Spectrometry proteomics data have been deposited to the ProteomeXchange Consortium with the dataset identifier PXD000397 and the complete proteomic, bioinformatic and biological analyses are reported in Cox et al. (2015) [1].


a b s t r a c t
The cancer secretome includes all of the macromolecules secreted by cells into their microenvironment. Cancer cell secretomes are significantly different to that of normal cells reflecting the changes that normal cells have undergone during their transition to malignancy. More importantly, cancer secretomes are known to be active mediators of both local and distant host cells and play an important role in the progression and dissemination of cancer. Here we have quantitatively profiled both the composition of breast cancer secretomes associated with osteotropism, and their modulation under normoxic and hypoxic conditions. We detect and quantify 162 secretome proteins across all conditions which show differential hypoxic induction and association with osteotropism. Mass Spectrometry proteomics data have been deposited to the ProteomeXchange Consortium with the dataset identifier PXD000397 and the complete proteomic, bioinformatic and biological analyses are reported in Cox

Value of the data
This data set will be of value to the scientific community wanting to determine which proteins are secreted from breast cancer cells that metastasize to bone, and those that are regulated by hypoxia.
The dataset includes quantitative global proteome (Label-free and SILAC) analysis of hypoxic induced secretomes of parental and bone tropic human breast cancer cell lines.
Differential analysis of secretome changes associated with bone tropism in breast cancer. Response of the cancer secretome to hypoxic induction.
1. Data, experimental design, materials and methods

Cell lines
The MDA-MB-231 Bone Tropic (BT) 1833 subclone cell line was obtained from J. Massagué at the Memorial Sloan-Kettering Cancer Center. The MDA-MB-231 parental cell line was obtained from the American Type Culture Collection (ATCC) (distributed by LGC Standards). All cell lines were routinely tested for mycoplasma and tested negative for murine pathogens by IMPACT I testing (IDEXX Laboratories). grown to 70% confluence, washed with PBS and the media changed to serum-free DMEM and incubated at either 21% oxygen, 5% CO 2, 80% humidity (normoxia) or 1% oxygen, 5% CO 2, 80% humidity (hypoxia) for 24 h in a Whitley H35 Hypoxystation (Don Whitley Scientific Ltd.). After 24 h, the conditioned media was collected and passed through a 0.22 μM filter to remove floating cells and cellular debris.

Collection of hypoxia induced conditioned medium (CM)
For SILAC-based Mass Spectrometry, the MDA-MB-231 parental (wt) and MDA-MB-231 Bone Tropic (BT) cell lines were both grown in light and heavy Stable Isotope Labelling of Amino Acids in Culture (SILAC) DMEM with 100 U/ml penicillin and 100 μg/ml streptomycin, plus 10% dialysed Fetal Bovine Serum (FBS). DMEM (Caisson Labs) lacking arginine and lysine was supplemented with 70 μg/mL light isotope 12 C-, 14 N-arginine and 140 μg/mL light isotope 12 C-, 14 N-lysine (Sigma) (R0/K0) or 70 μg/mL heavy isotope 13 C-, 15 N-arginine and 140 μg/mL heavy isotope 13 C-, 15 N-lysine (Cambridge Isotope Laboratories) (R10/K8). Cells were grown for 5 passages and label incorporation was assessed through quantification of the number of labelled vs. unlabelled peptides. A minimum of 97-98% labelled arginine and lysine incorporation, with less than 1% proline conversion, was required for subsequent proteomics studies. For conditioned media (CM) collection, cells were grown to 70% confluence, washed with PBS and the media changed to serum-free DMEM and incubated at either 21% oxygen, 5% CO 2, 80% humidity (normoxia) or 1% oxygen, 5% CO 2, 80% humidity (hypoxia) for 24 h in a Whitley H35 Hypoxystation (Don Whitley Scientific Ltd.). After 24 h, the conditioned media was collected and passed though a 0.22 μM filter to remove floating cells and cellular debris.

Preparation of CMs for Mass Spectrometry
After collection, label-free and SILAC-labelled CMs were reduced in volume using Vivaspin 6 10,000 MWCO, PES Membrane centrifugal concentrators (Sartorius). The remaining protein was fully dissolved in 6 M urea, 2 M thiourea and 10 mM HEPES pH 8, after which exact protein amounts were determined using a Bradford assay. In SILAC-labelled repeats, the two SILAC labels (R10/K8 and R0/K0) were mixed 1:1. In label-free repeats, the samples were left unmixed, but equal amounts of starting material were used for processing. Proteins were reduced in 1 mM DTT (Sigma) for 45 min at room temperature (21°C), alkylated for 45 min using 5.5 mM chloroacetamide (Sigma), and digested with 1:50 (enzyme:protein ratio) of Mass Spectrometry (MS)-grade trypsin (Sigma) overnight at 37°C. Peptides were acidified with trifluoroacetic acid at a final concentration of 2%, after which they were desalted on in-house packed C18 StageTips as previously described [4]. Briefly, 2 discs of C18 material (3 M Empore) were packed into a 200 μl pipette tip, and activated with 20 μl of Methanol (HPLC grade) and 20 μl of 80% Acetonitrile, 0.1% FA. The C18 material was equilibrated with 2 Â 20 μl of 1% TFA, 3% Acetonitrile, after which 5 μg of each sample was loaded onto individual StageTips. After washing with 2 Â 20 μl of 0.1% FA, peptides were eluted with 2 Â 40 μl 80% Acetonitrile, 0.1% FA, and concentrated to 5 μl in an Eppendorf Speedvac. This final concentrate was acidified with 4 μl 1% TFA, 2% Acetonitrile for Mass Spectrometry analysis.

Mass Spectrometry acquisition and analysis
5 μg of peptides were loaded onto a 50 cm C18 reverse-phase analytical column (Thermo Easy-Spray ES803) using 100% Buffer A (0.1% Formic acid in water) at 720 bar, using the Thermo EasyLC 1000 uHPLC system in a single-column setup and the column oven operating at 45°C. Peptides were eluted over a 4 hour gradient ranging from 6 to 60% of 80% acetonitrile, 0.1% formic acid, and the Q-Exactive (Thermo Fisher Scientific) was run in a DD-MS2 top10 method. Full MS spectra were collected at a resolution of 70,000, with an AGC target of 3 Â 10 6 or maximum injection time of 20 ms and a scan range of 300-1750 m/z. The MS2 spectra were obtained at a resolution of 17,500, with an AGC target value of 1 Â 10 6 or maximum injection time of 60 ms, a normalised collision energy of 25 and an underfill ratio of 0.1%. Dynamic exclusion was set to 45 s, and ions with a charge state o 2 or unknown were excluded. MS performance was verified for consistency by running complex cell lysate quality control standards, and chromatography was monitored to check for reproducibility.

Data processing protocol
Raw data were processed using MaxQuant version 1.5.0.0 [5] (using the human Ensembl GRCh38 database) and Perseus version 1.4. Results were analysed using scripts written in-house in Python and R (deposited in the ProteomeXchange repository alongside data), and tested for statistical significance using the quantile function in the R statistical framework. To ensure high confidence identifications  Table 1. and quantification, a MaxQuant score of 450 and a minimum of two unique peptides per protein seen by tandem MS in all repeats were required. Initial analysis was undertaken using a label-free approach (two repeats) for global pairwise analyses, and data subsequently validated in a standardand reverse-label SILAC approach (two repeats). Identified intracellular contaminants were removed and secreted proteins retained by using the cellular compartment annotations in Ensembl and Pan-therDB, and Gene Ontology annotation enrichment for extracellular-associated terms. Overlap between global secretome Madd Spectrometry repeats in are shown in Fig. 1a for each condition. Log 2 expression data of extracellular secreted proteins in pairwise analysis is shown in Fig. 1b. A full list of detected proteins and their quantitative expression is available in supplementary Table 1. The complete proteomic, bioinformatic and biological analyses are reported in Cox et al. (2015) [1].

Conflict of interest
The authors declare no conflict of interest.