A proteomic profiling dataset of recombinant Chinese hamster ovary cells showing enhanced cellular growth following miR-378 depletion

The proteomic data presented in this article provide supporting information to the related research article "Depletion of endogenous miRNA-378-3p increases peak cell density of CHO DP12 cells and is correlated with elevated levels of Ubiquitin Carboxyl-Terminal Hydrolase 14" (Costello et al., in press) [1]. Control and microRNA-378 depleted CHO DP12 cells were profiled using label-free quantitative proteomic profiling. CHO DP12 cells were collected on day 4 and 8 of batch culture, subcellular proteomic enrichment was performed, and subsequent fractions were analyzed by liquid chromatography tandem mass spectrometry (LC-MS/MS). Here we provide the complete proteomic dataset of proteins significantly differentially expressed by greater than 1.25-fold change in abundance between control and miR-378 depleted CHO DP12 cells, and the lists of all identified proteins for each condition.


Specifications
Value of the data
This data reveals protein expression patterns associated with microRNA-378. Differentially expressed proteins between control and miR-378 depleted CHO cells may serve as indicators of CHO cell growth.
This dataset reports enriched proteins from the cytosolic and membrane subcellular fractions of CHO DP12 cells.
This data provides proteomic profiles for two time-points of CHO DP12 batch culture: exponential and stationary phase.

Data
The data presents a quantitative proteomic profiling of subcellular-enriched protein fractions from day 4 and day 8 cultures of CHO DP12 cells following microRNA-378 stable depletion. Both the cytosolic and membrane protein enriched fractions were analysed to identify significantly differentially expressed proteins between control and miR-378 depleted CHO cells (miR-378-spg) for each timepoint. Differentially expressed proteins between control and miR-378-spg cells are required to have (i) a p-value ≤ 0.05 on the peptide and the protein level and (ii) a minimum of 1.25-fold change in normalized abundance levels.
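The two criteria above can be expressed as a simple filter. The following is a minimal Python sketch, not the authors' pipeline (the analysis itself was carried out in Progenesis QI for Proteomics). The `ProteinResult` record and its field names are hypothetical, the fold change is assumed to be the ratio of the higher group mean to the lower one, and only the protein-level p-value is checked for brevity.

```python
from dataclasses import dataclass


@dataclass
class ProteinResult:
    # All field names are illustrative assumptions, not a Progenesis export format.
    accession: str
    anova_p: float        # protein-level one-way ANOVA p-value
    mean_control: float   # mean normalised abundance in control cells
    mean_depleted: float  # mean normalised abundance in miR-378 depleted cells


def max_fold_change(r: ProteinResult) -> float:
    """Ratio of the higher group mean to the lower one (always >= 1)."""
    high = max(r.mean_control, r.mean_depleted)
    low = min(r.mean_control, r.mean_depleted)
    if high == 0:
        return 1.0            # no signal in either group: no change
    if low == 0:
        return float("inf")   # detected in one group only
    return high / low


def is_differential(r: ProteinResult, p_cutoff: float = 0.05,
                    fc_cutoff: float = 1.25) -> bool:
    """Apply criteria (i) p-value <= 0.05 and (ii) >= 1.25-fold change."""
    return r.anova_p <= p_cutoff and max_fold_change(r) >= fc_cutoff


def increased_in_depleted(r: ProteinResult) -> bool:
    """Differential proteins that are more abundant in the depleted cells."""
    return is_differential(r) and r.mean_depleted > r.mean_control
```

Splitting the direction check from the threshold check mirrors the way the tables report only proteins with increased abundance, while the supplementary list covers both directions.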
Tables 1-4 list the differentially expressed proteins with an increased abundance in miR-378 depleted cells when compared to control cells. Proteins with an increased abundance in miR-378-spg cells represent potential direct targets of miR-378 in CHO cells and are of most interest. Tables 1-4 report the accession number, peptide count, number of unique peptides, ANOVA p-value, q-value, maximum fold-change and protein name. Supplementary Table S1 presents the complete list of all differentially overexpressed and underexpressed proteins for each subcellular fraction and timepoint. Supplementary Table S2 presents the qualitative list of all identified proteins for each condition (control and miR-378-spg), subcellular enriched fraction (cytosolic and membrane protein enriched) and time-point (day 4 and day 8 of culture). The heat maps in Fig. 1 outline the clustering of significantly increased versus decreased proteins in miR-378-spg cells, as compared to control cells.

Subcellular protein extraction and in-solution protein digestion
Triplicate biological samples for control and miR-378 depleted cells were collected on day 4 and day 8 of batch cultures. Subcellular protein enrichment was achieved using the Mem-PER Plus Membrane Protein Extraction Kit (#89842, Thermo Fisher Scientific), which yielded a cytosolic and a membrane protein enriched fraction. Protein concentration was determined using the QuickStart Bradford assay (Bio-Rad). Equal concentrations (100 mg) of protein from each sample were purified and trypsin digested for mass spectrometry using the filter-aided sample preparation method as previously described [2]. The resulting peptide samples were purified using Pierce C18 spin columns, dried using vacuum centrifugation, and suspended in 2% acetonitrile and 0.1% trifluoroacetic acid in LC grade water prior to LC-MS/MS analysis.

Label-free liquid chromatography mass spectrometry
Quantitative label-free liquid-chromatography mass spectrometry (LC-MS/MS) analysis of miR-378-spg and NC-spg membrane and cytosolic fractions from day 4 and day 8 was carried out using a Dionex UltiMate™ 3000 RSLCnano system (Thermo Fisher Scientific) coupled to a hybrid linear LTQ Orbitrap XL mass spectrometer (Thermo Fisher Scientific). LC-MS/MS methods were applied as previously described [3]. A 5 μL injection of each sample was loaded onto a C18 trapping column (PepMap100, C18, 300 μm × 5 mm; Thermo Fisher Scientific). Each sample was desalted for 5 min using a flow rate of 25 μL/min with 2% acetonitrile (ACN), 0.1% trifluoroacetic acid (TFA) before being switched online with the analytical column (PepMap C18, 75 μm ID × 250 mm, 3 μm particle and 100 Å pore size; Thermo Fisher Scientific). Peptides were eluted using a binary gradient of Solvent A (2% ACN and 0.1% formic acid in LC grade water) and Solvent B (80% ACN and 0.08% formic acid in LC grade water). The following gradient was applied: 6-25% solvent B for 120 min and 25-50% solvent B in a further 60 min at a column flow rate of 300 nL/min. Data was acquired with Xcalibur software, version 2.0.7 (Thermo Fisher Scientific). The LTQ Orbitrap XL was operated in data-dependent mode with full MS scans in the 400-1200 m/z range using the Orbitrap mass analyser with a resolution of 30,000 (at m/z 400). Up to three of the most intense ions (+1, +2, and +3) per scan were fragmented using collision-induced dissociation (CID) in the linear ion trap. Dynamic exclusion was enabled with a repeat count of 1, repeat duration of 20 s, and exclusion duration of 40 s. All tandem mass spectra were collected using a normalized collision energy of 32%, and an isolation window of 2 m/z with an activation time of 30 ms.
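To make the elution programme concrete, the two gradient segments can be written as a piecewise function of time. This is an illustrative Python sketch only: it assumes each segment is linear (the text gives only the end points and durations) and that solvent volumes mix additively. The names `GRADIENT_SEGMENTS`, `percent_b` and `percent_acn` are hypothetical.

```python
# (start_min, end_min, start_percent_B, end_percent_B), taken from the text.
GRADIENT_SEGMENTS = [
    (0.0, 120.0, 6.0, 25.0),
    (120.0, 180.0, 25.0, 50.0),
]

ACN_IN_SOLVENT_A = 2.0   # % ACN in Solvent A
ACN_IN_SOLVENT_B = 80.0  # % ACN in Solvent B


def percent_b(t_min: float) -> float:
    """Percentage of Solvent B at time t_min, assuming linear segments."""
    if t_min < 0.0 or t_min > GRADIENT_SEGMENTS[-1][1]:
        raise ValueError("time is outside the described gradient")
    for start, end, b_start, b_end in GRADIENT_SEGMENTS:
        if start <= t_min <= end:
            fraction = (t_min - start) / (end - start)
            return b_start + (b_end - b_start) * fraction
    raise ValueError("time is not covered by any segment")


def percent_acn(t_min: float) -> float:
    """Approximate ACN content of the mobile phase (additive mixing assumed)."""
    b = percent_b(t_min) / 100.0
    return (1.0 - b) * ACN_IN_SOLVENT_A + b * ACN_IN_SOLVENT_B
```

Under these assumptions the mobile phase is halfway through the first segment at 60 min, and `percent_acn` shows how the organic content rises more steeply in the final 60 min.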

Quantitative label-free LC-MS/MS data analysis
Protein identification was achieved using Proteome Discoverer 2.1 with the Sequest HT and MASCOT algorithms followed by Percolator validation [4] to apply a false-discovery rate < 0.01. Data was searched against the NCBI Chinese Hamster (Cricetulus griseus) protein database containing 44,065 sequences (fasta file downloaded November 2015). The following search parameters were used for protein identification: (1) precursor mass tolerance set to 20 ppm, (2) fragment mass tolerance set to 0.6 Da, (3) up to two missed cleavages were allowed, (4) carbamidomethylation of cysteine set as a static modification and (5) methionine oxidation set as a dynamic modification. The complete lists of all identified proteins from the cytosolic and membrane enriched fractions of day 4 and day 8 cell cultures of the control (NC378-spg) and miR-378-spg are provided in Supplementary Table S2.
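A precursor tolerance expressed in ppm scales with the mass being matched, unlike the fixed 0.6 Da fragment tolerance. The short Python sketch below illustrates the arithmetic only. It is not how Sequest HT or MASCOT implement matching, the function names are hypothetical, and the example m/z values are made up.

```python
def ppm_error(observed_mz: float, theoretical_mz: float) -> float:
    """Signed mass error in parts per million."""
    return (observed_mz - theoretical_mz) / theoretical_mz * 1e6


def within_precursor_tolerance(observed_mz: float, theoretical_mz: float,
                               tol_ppm: float = 20.0) -> bool:
    """True if the observed value lies within +/- tol_ppm of the theoretical one."""
    return abs(ppm_error(observed_mz, theoretical_mz)) <= tol_ppm


def tolerance_window(theoretical_mz: float, tol_ppm: float = 20.0):
    """Absolute (low, high) window corresponding to a ppm tolerance."""
    delta = theoretical_mz * tol_ppm / 1e6
    return theoretical_mz - delta, theoretical_mz + delta


# Made-up example: at m/z 800, 20 ppm corresponds to +/- 0.016.
low, high = tolerance_window(800.0)
```

At m/z 800 the 20 ppm window is roughly 40 times narrower than the 0.6 Da fragment tolerance, which reflects the higher mass accuracy of the Orbitrap precursor scan relative to ion trap fragment spectra.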
Quantitative label-free data analysis was performed using Progenesis QI for Proteomics (version 2.0; Nonlinear Dynamics, a Waters company) as described by the manufacturer (www.nonlinear.com). To counteract potential drifts in retention time, a reference run was assigned to which all MS data files were aligned. The triplicate samples from the two experimental groups (NC-378-spg and miR-378-spg) were set up for differential analysis, and label-free relative quantitation was carried out after peak detection, automatic retention time calibration and normalisation.
Table 1 Mass spectrometric identification of 28 proteins from the cytosolic enriched protein fraction with ≥ 1.25-fold increase in the miR-378 depleted CHO cells on day 4 of cell culture.
The following settings were applied to filter peptide features: (1) peptide features with a one-way ANOVA p-value < 0.05 between experimental groups, (2) mass peaks with charge states from +1 to +3 and (3) greater than one isotope per peptide. The normalised data was transformed prior to statistical analysis using an arcsinh transformation to meet the assumptions of the one-way ANOVA test. A Mascot generic file (mgf) was generated from all exported MS/MS spectra that satisfied the peptide filters, and the mgf was used for peptide and protein identification in Proteome Discoverer. Protein identifications were imported into Progenesis and considered differentially expressed if they passed the following criteria: (i) a protein one-way ANOVA p-value < 0.05 and (ii) a ≥ 1.25-fold change in relative abundance between the two experimental groups. All differentially expressed proteins identified between NC378-spg and miR-378-spg cells are reported in Supplementary Table S1. Heatmaps illustrating protein abundances for statistically significant and differentially expressed proteins were designed using ggplot2 in RStudio.
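The arcsinh transformation and one-way ANOVA described above can be illustrated with a small Python sketch using only the standard library. It is a simplified stand-in, not the Progenesis implementation: it computes the F statistic for any number of groups but not the p-value, which would be obtained from the F distribution with (k - 1, N - k) degrees of freedom (for example with `scipy.stats.f_oneway`). The abundance values are invented for illustration.

```python
import math
from statistics import mean


def arcsinh_transform(values):
    """Variance-stabilising transform; unlike log, it is defined at zero."""
    return [math.asinh(v) for v in values]


def one_way_anova_f(groups):
    """F statistic of a one-way ANOVA across a list of sample groups."""
    k = len(groups)
    n_total = sum(len(g) for g in groups)
    if k < 2 or n_total <= k:
        raise ValueError("need at least two groups and replicate values")
    grand_mean = mean([v for g in groups for v in g])
    group_means = [mean(g) for g in groups]
    # Variation of group means around the grand mean.
    ss_between = sum(len(g) * (m - grand_mean) ** 2
                     for g, m in zip(groups, group_means))
    # Variation of replicates around their own group mean.
    ss_within = sum((v - m) ** 2
                    for g, m in zip(groups, group_means) for v in g)
    if ss_within == 0:
        return float("inf")
    ms_between = ss_between / (k - 1)
    ms_within = ss_within / (n_total - k)
    return ms_between / ms_within


# Invented normalised abundances for one protein, three replicates per group.
control = arcsinh_transform([1.0e5, 1.2e5, 0.9e5])
depleted = arcsinh_transform([1.6e5, 1.5e5, 1.8e5])
f_stat = one_way_anova_f([control, depleted])
```

With two groups of three replicates the test has (1, 4) degrees of freedom and is equivalent to a two-sample t-test with pooled variance.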
The normalised abundance values of differentially expressed proteins were determined using Progenesis QI for Proteomics and loaded as a txt file into RStudio, where the data was log2 transformed. Hierarchical Pearson clustering was then performed on Z-score normalised intensity values by clustering both samples and proteins.
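The preprocessing behind the heatmaps can be sketched as follows. The authors worked in R; this Python version is only an illustration of the log2 transform, per-protein Z-scoring and the Pearson-based distance (1 - r) that a hierarchical clustering routine would consume. The linkage method is not stated in the text, so the clustering step itself is not shown, and the use of the sample standard deviation is an assumption.

```python
import math
from statistics import mean, stdev


def log2_transform(row):
    """log2 of each (strictly positive) normalised abundance."""
    return [math.log2(v) for v in row]


def z_score(row):
    """Centre and scale one protein's values across samples."""
    m = mean(row)
    s = stdev(row)  # sample standard deviation (n - 1), an assumption
    if s == 0:
        return [0.0] * len(row)
    return [(v - m) / s for v in row]


def pearson(x, y):
    """Pearson correlation coefficient of two equal-length profiles."""
    mx, my = mean(x), mean(y)
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    if sx == 0 or sy == 0:
        raise ValueError("correlation is undefined for a constant profile")
    return cov / (sx * sy)


def pearson_distance(x, y):
    """0 for perfectly correlated profiles, 2 for perfectly anti-correlated."""
    return 1.0 - pearson(x, y)


# Invented abundances: three control followed by three depleted samples.
protein_a = z_score(log2_transform([100.0, 110.0, 90.0, 200.0, 220.0, 180.0]))
protein_b = z_score(log2_transform([400.0, 380.0, 420.0, 200.0, 190.0, 210.0]))
distance_ab = pearson_distance(protein_a, protein_b)
```

Proteins that rise together in miR-378-spg cells have a small distance and end up in the same branch, whereas a protein that falls (like the invented `protein_b`) sits far from one that rises.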