Dataset of Rat and Human Serum Proteomes Derived from Differential Depletion Strategies prior to Mass Spectrometry

This article provides information regarding the effect of four common high-abundance protein (albumin and immunoglobulins (Ig)) depletion strategies upon serum proteomics datasets derived from normal, non-diseased rat or human serum. After tryptic digest, peptides were separated and analyzed using C18 reverse phase liquid chromatography-tandem mass spectrometry (rpLC-MS/MS). Peptide spectral matching (PSM) and database searching were conducted using MS Amanda 2.0 and Sequest HT. Peptide and protein false discovery rates (FDR) were set at 0.01%, with at least two peptides assigned per protein. Protein quantitation and the extent of albumin and Ig removal were defined by PSM counts. Venn diagram analysis of the core proteomes, derived from proteins identified by both search engines, was performed using Venny. Ontological characterization and gene set enrichment were performed using WebGestalt. The dataset resulting from each depletion column is provided.


Value of the data
• The data comprise a workflow for proteomic analysis of depleted human and rat serum samples generated with a wide selection of commercially available kits, which may help other researchers select a depletion method suited to their target of interest.
• The dataset includes a comparison of protein content derived from spectral count data, as defined by the MS Amanda and Sequest HT search engines, and peptide-spectrum match (PSM) output, which may help other researchers optimize search engines and post-processing approaches to maximize peptide and protein identification from high-resolution mass data.
• The dataset includes serum proteomes for non-injured rats and healthy human subjects, which may serve other researchers as a baseline or control dataset, reflective of normal or healthy conditions, against which putative biomarkers may be compared.

Data
The workflow for sample preparation, data collection, processing, and analysis is shown (Fig. 1) for four commercially available depletion columns (Supplementary Table 1).
The total number of PSMs and the extent of albumin (Table 1) or Ig (Table 2) removal were determined in depleted human or rat serum samples. Serum proteins detected under each condition are listed for human (Supplementary Tables 2 A-E) and rat serum (Supplementary Tables 3 A-E). The accession number, protein name, description, PSMs, and q-values, as well as the number of total or unique peptides, are indicated for each proteome as derived from each search engine, Sequest HT or MS Amanda.
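As a rough illustration of how PSM counts can express the extent of albumin or Ig removal, the sketch below computes the share of all PSMs in a sample that are assigned to a set of target proteins. The function name, protein labels, and counts are hypothetical and are not taken from Tables 1 or 2; the exact calculation behind those tables may differ.

```python
def psm_share(psm_counts, targets):
    """Return the percentage of all PSMs assigned to the target proteins.

    psm_counts: dict mapping a protein label to its PSM count.
    targets: set of protein labels to sum over (e.g. albumin only).
    """
    total = sum(psm_counts.values())
    if total == 0:
        raise ValueError("sample contains no PSMs")
    target_psms = sum(n for name, n in psm_counts.items() if name in targets)
    return 100.0 * target_psms / total


# Hypothetical PSM counts for one depleted sample (illustrative only).
example_counts = {"ALBU": 120, "IGHG1": 30, "APOA1": 250, "TRFE": 100}

albumin_pct = psm_share(example_counts, {"ALBU"})
albumin_and_igg_pct = psm_share(example_counts, {"ALBU", "IGHG1"})
```

Comparing this percentage between undepleted and depleted serum gives one simple measure of how much of the acquired spectra a high-abundance protein still occupies.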
The number of proteins identified with each depletion column and search engine was compared using Venn diagram analysis (Fig. 2). Next, the concordance between the human and rat serum proteomes was assessed. The serum proteome for each species was taken from the column that yielded the greatest number of unique, non-redundant protein identifications (Fig. 3). Lastly, the core proteome was defined as the list of proteins identified by both Sequest HT and MS Amanda for each species and column used. This protein list was then used to determine the extent to which depletion strategies affected the overall characterization of the serum proteome, as defined by biological function using the WEB-based Gene SeT AnaLysis Toolkit (WebGestalt) (Fig. 4 A-B).
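The core proteome definition above amounts to a set intersection of the two search engine outputs, and a two-set Venn diagram reports the sizes of the shared and engine-specific regions. A minimal sketch follows, assuming each result is available as a list of accession numbers; the function names and the example lists are hypothetical and do not reproduce the supplementary tables or the Venny tool itself.

```python
def core_proteome(sequest_ids, amanda_ids):
    """Proteins identified by both search engines (set intersection)."""
    return set(sequest_ids) & set(amanda_ids)


def venn_regions(a, b):
    """Sizes of the three regions of a two-set Venn diagram."""
    a, b = set(a), set(b)
    return {"only_a": len(a - b), "shared": len(a & b), "only_b": len(b - a)}


# Hypothetical accession lists for one species and column (illustrative only).
sequest = ["P02768", "P02787", "P01024", "P00738"]
amanda = ["P02768", "P01024", "P02647"]

core = core_proteome(sequest, amanda)
regions = venn_regions(sequest, amanda)
```

The same two functions can be applied to any pair of lists, for example to compare two depletion columns searched with the same engine.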

Serum Preparation and Depletion
Pooled serum from either healthy human males aged 20-30 (n = 10) or male Sprague Dawley rats (n = 10) (BioIVT, Baltimore, MD) was thawed on ice, centrifuged at 1,500 × g for 10 min at 4 °C, then split into aliquots and stored at -80 °C until biochemical analysis. For column-based depletion, serum samples were thawed on ice, then filtered using 0.45 μm cellulose acetate microspin columns (Sigma-Aldrich Inc., St. Louis, MO, USA). Twelve microliters of sera were used for Top 12™ Abundant Protein Depletion Spin Columns (Pierce Biotechnology, Rockford, IL, USA), and 25 μL each was used for depletion of sera with either PureProteome™ Albumin/IgG Magnetic Beads (Millipore Sigma, Burlington, MA, USA), AlbuSorb™ PLUS (Biotech Support Group, Monmouth Junction, NJ, USA), or Seppro® Rat Spin Columns (Sigma-Aldrich Inc., St. Louis, MO, USA). All columns are reported to remove albumin and IgGs, but the Top 12™ Abundant Protein Depletion Column also removes ten additional proteins: Alpha-1-Acid glycoprotein, Alpha-1-Antitrypsin, Alpha-2-Macroglobulin, Apolipoproteins A-I and A-II, Fibrinogen, Haptoglobin, IgA, IgM, and Transferrin. The Seppro® Rat Spin Column removes five additional proteins, namely Alpha-1-Antitrypsin, Fibrinogen, Haptoglobin, IgM, and Transferrin. Although this column is intended for rat serum according to the manufacturer's instructions, it is reported to cross-react with high-abundance proteins in human sera [1]. Depletion procedures were performed at room temperature (RT, 20-25 °C) according to the manufacturer's instructions. Eluted, depleted serum was immediately stored at 4 °C until further analysis.

Protein Assays
Total protein content of depleted sera was determined using the microBCA protein assay according to the manufacturer's instructions (Pierce Biotechnology, Rockford, IL, USA).

Tryptic Digest
Depleted serum containing 50 μg of protein was denatured and reduced with 8 M urea supplemented with 1 M DTT, with shaking at 500 RPM in a 37 °C heated shaker for 45 minutes. Samples were alkylated with iodoacetamide (IAA) (50 mM final concentration, Sigma-Aldrich Inc., St. Louis, MO, USA) in the dark for 45 minutes. Samples were diluted to ≤ 1 M urea, then supplemented with 1 mL of 50 mM NH4HCO3 (pH 8.0) and with 2-2.5 μL of 6 N NaOH to adjust the pH to 8.5-9.0. Samples were then digested by adding 2 μg of Trypsin Gold (Promega, Madison, WI, USA) for 16-18 hours, shaking at 500 RPM, at 37 °C. Digestion was terminated with 20 μL formic acid (Sigma-Aldrich Inc., St. Louis, MO, USA). The final pH was adjusted to 2.5-3.5 using 6 N HCl.
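The dilution and enzyme amounts in this step follow from simple arithmetic, sketched below. Bringing 8 M urea to 1 M or below requires at least an 8-fold dilution (C1 × V1 = C2 × V2), and 2 μg of trypsin for 50 μg of protein corresponds to a 1:25 (w/w) enzyme-to-protein ratio. The function names and the 100 μL starting volume are assumptions for illustration; the actual sample volume is not stated in the text.

```python
def min_final_volume_ul(initial_volume_ul, initial_molar, target_molar):
    """Smallest final volume (uL) that dilutes a solute to target_molar.

    Uses C1 * V1 = C2 * V2, solved for V2.
    """
    if min(initial_volume_ul, initial_molar, target_molar) <= 0:
        raise ValueError("volumes and concentrations must be positive")
    return initial_volume_ul * initial_molar / target_molar


def protein_per_enzyme(protein_ug, enzyme_ug):
    """Mass of protein per unit mass of enzyme (the X in a 1:X w/w ratio)."""
    return protein_ug / enzyme_ug


# Hypothetical 100 uL sample in 8 M urea, diluted to 1 M urea.
final_volume = min_final_volume_ul(100.0, 8.0, 1.0)
dilution_factor = final_volume / 100.0

# Amounts stated in the text: 2 ug trypsin for 50 ug protein.
ratio = protein_per_enzyme(50.0, 2.0)
```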

Solid Phase Extraction
Empore™ Solid Phase Extraction Cartridges (3M, St. Paul, MN, USA) were used to remove debris from the sample prior to analysis. Briefly, cartridges were washed twice with 1 mL Activation Buffer (80% acetonitrile, 20% water, and 0.1% trifluoroacetic acid (TFA)), then twice with 1 mL Wash Buffer (95% water, 5% acetonitrile, and 0.1% TFA). Digested serum-derived protein samples were added to the column prior to centrifugation at 1,500 × g at RT. Bound peptides were washed twice with 1 mL Wash Buffer. Liquids were passed through the disk cartridges by positive pressure centrifugation at 1,500 × g at RT (EBA 20, Hettich Zentrifugen, Tuttlingen, Germany). Peptides were eluted with 1 mL Activation Buffer into LoBind microcentrifuge tubes (Eppendorf, Hamburg, Germany), then dried (Savant™ SPD131DDA SpeedVac, Thermo Fisher Scientific, Waltham, MA, USA) for 3-4 hours at RT. Lyophilized samples were stored at -80 °C until rpLC-MS/MS analysis.

Mass Spectrometry
Lyophilized sera-derived peptides were thawed on ice for 30 minutes and reconstituted in 100 μL of sterile, proteomics-grade peptide sample buffer (95% water, 5% acetonitrile (ACN), and 0.1% formic acid (FA)). Samples were filtered using 0.45 μm cellulose acetate microspin filters that had been pre-washed with sample buffer. Thereafter, 10 μL was transferred to glass HPLC vials (Waters, Milford, MA, USA). RpLC was performed using an UltiMate 3000 RSLCnano system with a binary high-pressure gradient pump and a Dionex WPS-3000 autosampler (Thermo Fisher Scientific, Germering, Germany) coupled to an EASY-Spray column. Data acquisition and gradient control were performed with Chromeleon, version 7.0 (Dionex, Sunnyvale, CA, USA). Human sera peptides were concentrated and washed on a trapping pre-column (Acclaim PepMap C18, 75 μm × 2 cm nanoViper, 3 μm, 100 Å, Thermo Fisher Scientific), then separated on a C18 reversed-phase column (Acclaim PepMap RSLC C18, 50 μm × 15 cm nanoViper, 2 μm, 100 Å, Thermo Fisher Scientific) with a 150 min linear gradient from 2% to 95% Eluent B (0.1% formic acid in 100% acetonitrile) in Eluent A (0.1% formic acid in 100% water) at a flow rate of 300 nL/min. Rat sera peptide mixtures were fractionated on an RSLC C18 column (25 cm × 75 μm nanoViper, 2 μm, 100 Å) using a 150 min linear gradient from 2% to 95% Eluent B. MS/MS analysis was performed using an Orbitrap Fusion Lumos Tribrid mass spectrometer (Thermo Fisher Scientific, Waltham, MA, USA) in data-dependent positive ion mode with Xcalibur, v.1.0.2.65 SP2 (Thermo Fisher Scientific, Bremen, Germany). The scan range was 350-1600 m/z for precursors (MS1), followed by charge-state determination and higher-energy collisional dissociation (HCD) scans of the ten most intense ions, with data collection in profile mode.
Monoisotopic peak determination, charge-state screening, and data-dependent dynamic exclusion were enabled, with exclusion of non-assigned peptides, an intensity threshold of 3 × 10^4, a repeat count of two, and an exclusion duration of 60 s for ions within ±10 ppm of the parent ion mass. The automatic gain control (AGC) settings were 4 × 10^5 and 2 × 10^5 ions for survey and HCD modes, respectively. Scan times were set at 50 ms for survey mode and 100 ms for HCD mode. For HCD, collision energy was set at 35%. Quadrupole isolation mode and the Orbitrap detector were used for both survey mode (resolution = 120,000) and HCD mode (resolution = 15,000). All data were acquired in centroid mode and all runs were carried out in triplicate.
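To make the ±10 ppm exclusion window concrete, the sketch below converts a parts-per-million tolerance into an absolute m/z window and checks whether an observed value falls inside it. The function names and the m/z values are hypothetical; this only illustrates the arithmetic and is not the instrument's internal exclusion logic.

```python
def ppm_window(mz, tolerance_ppm=10.0):
    """Return (low, high) m/z bounds for a symmetric ppm tolerance."""
    delta = mz * tolerance_ppm / 1e6
    return mz - delta, mz + delta


def within_ppm(observed_mz, reference_mz, tolerance_ppm=10.0):
    """True if observed_mz lies within tolerance_ppm of reference_mz."""
    error_ppm = abs(observed_mz - reference_mz) / reference_mz * 1e6
    return error_ppm <= tolerance_ppm


# Hypothetical precursor at m/z 500: 10 ppm corresponds to 0.005 m/z.
low, high = ppm_window(500.0)
inside = within_ppm(500.004, 500.0)   # about 8 ppm away
outside = within_ppm(500.006, 500.0)  # about 12 ppm away
```

Because the tolerance is relative, the absolute window widens with m/z: the same 10 ppm spans 0.0035 m/z at the low end of the 350-1600 m/z scan range and 0.016 m/z at the high end.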

Database Search and Label Free Quantitation
RpLC-MS/MS data were analyzed using a pipeline implemented in Proteome Discoverer, version 2.2 (Thermo Fisher Scientific, Bremen, Germany). Mass spectrometry .raw files were searched with MS Amanda (version 2.0) and Sequest HT against human or rat databases from UniProtKB/Swiss-Prot (release 2018-06) with the following parameters: two tryptic missed cleavages; precursor mass tolerance ≤ 10 ppm; MS/MS mass tolerance ≤ 0.02 Da; charge states of +2, +3, and +4; cysteine carbamidomethylation (+57.021 Da) as a static modification; and methionine oxidation (+15.995 Da) as a dynamic modification. Protein and peptide validation (FDR < 0.01%) was determined using Percolator. Label-free quantification was conducted using all peptides with a q-value of ≤ 0.01 and a peptide rank ≥ 1.
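The filtering and spectral counting described here can be sketched in a few lines, assuming the PSM table has been exported as (protein, peptide, q-value) records. This is a simplified stand-in for illustration, not the Proteome Discoverer or Percolator implementation; the function name, record layout, and example data are hypothetical. The thresholds mirror those stated in the text (q-value ≤ 0.01 and at least two peptides per protein).

```python
from collections import defaultdict


def spectral_counts(psms, q_max=0.01, min_peptides=2):
    """Count confident PSMs per protein.

    psms: iterable of (protein, peptide_sequence, q_value) tuples.
    A PSM is kept if q_value <= q_max; a protein is reported only if
    its kept PSMs cover at least min_peptides distinct peptides.
    """
    counts = defaultdict(int)
    peptides = defaultdict(set)
    for protein, peptide, q_value in psms:
        if q_value <= q_max:
            counts[protein] += 1
            peptides[protein].add(peptide)
    return {p: n for p, n in counts.items() if len(peptides[p]) >= min_peptides}


# Hypothetical PSM records (illustrative only).
example_psms = [
    ("PROT_A", "PEPTIDEK", 0.001),
    ("PROT_A", "PEPTIDEK", 0.002),
    ("PROT_A", "ANOTHERR", 0.005),
    ("PROT_B", "SINGLEK", 0.001),   # only one distinct peptide
    ("PROT_C", "LOWCONFK", 0.050),  # fails the q-value cut-off
    ("PROT_C", "OTHERR", 0.003),
]

quant = spectral_counts(example_psms)
```

In this example only PROT_A is reported: PROT_B has a single peptide, and PROT_C is left with one peptide once its low-confidence PSM is removed.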