Quantitative proteomic profiling of the extracellular matrix of pancreatic islets during the angiogenic switch and insulinoma progression

The angiogenic switch, the time at which a tumor becomes vascularized, is a critical step in tumor progression. Indeed, without blood supply, tumors will fail to grow beyond 1 mm³ and are unlikely to disseminate. The extracellular matrix (ECM), a major component of the tumor microenvironment, is known to undergo significant changes during angiogenesis and tumor progression. However, the extent of these changes remains unknown. In this study, we used quantitative proteomics to profile the composition of the ECM of pancreatic islets in a mouse model of insulinoma characterized by a precisely timed angiogenic switch. Out of the 120 ECM proteins quantified, 35 were detected in significantly different abundance as pancreatic islets progressed from being hyperplastic to angiogenic to insulinomas. Among these, the core ECM proteins EFEMP1, fibrillin 1, and periostin were found in higher abundance, and decorin, Dmbt1, hemicentin, and Vwa5 in lower abundance. Because the angiogenic switch is a common feature of solid tumors, we propose that some of the proteins identified represent potential novel anti-angiogenic targets. In addition, we report the characterization of the ECM composition of normal pancreatic islets and propose that this could be of interest for the design of tissue-engineering strategies for treatment of diabetes.


SUPPLEMENTARY METHODS

iTRAQ labeling
Desalted peptides were labeled with 4-plex iTRAQ reagents as directed by the manufacturer (AB Sciex, Foster City, CA), using 1 unit of labeling reagent for each time-point sample.
Dried 80 µg aliquots of each of the 4 time points (normal islets, hyperplastic islets, angiogenic islets, and insulinomas) for an experiment were reconstituted in 30 µL of 1 M triethylammonium bicarbonate (TEAB), and 70 µL of ethanol was added to each sample. One unit of iTRAQ reagent (~20 µL) was then added to each sample, mixed, and incubated at room temperature for 1 hour. Two microliters of each sample were used to check label incorporation by LC-MS/MS prior to quenching the reaction. Unquenched bulk samples were stored at -80°C. After verifying that labeling efficiency was satisfactory (>95% label incorporation), the reactions were quenched by adding 5 µL of 1 M Tris pH 8 for a final concentration of ~50 mM and incubating at room temperature for 15 minutes prior to mixing the samples. For experiment 1, the initial label incorporation was unsatisfactory (70-90%), with the normal and hyperplastic islet samples being the lowest. For re-labeling, the frozen bulk samples were dried down to 30 µL, followed by addition of 70 µL of ethanol and another unit of iTRAQ reagent as described above. Labeled samples of the 4 different time points were mixed together, dried down, and desalted using Oasis HLB 1 cc (30 mg) reversed-phase cartridges as previously described for post-digestion clean-up 1,2. Eluates were reduced in volume to near dryness and stored at -80°C.

Basic reversed-phase (bRP) fractionation
Desalted 4-plex iTRAQ-labeled peptide mixtures for each experiment were reconstituted in 540 µL of 20 mM ammonium formate/2% acetonitrile pH 10, loaded on a Zorbax 300 Extend 2.1 x 150 mm column (Agilent Technologies, Santa Clara, CA), and fractionated on an Agilent 1100 Series HPLC instrument by basic-reversed-phase chromatography at a flow rate of 200 µL/min.
The mobile phase consisted of 20 mM ammonium formate/2% acetonitrile pH 10 (buffer A) and 20 mM ammonium formate/90% acetonitrile pH 10 (buffer B). After loading 500 µL of sample onto the column, the peptides were separated using the following gradient: 5 min isocratic hold at 0% B; 0 to 15% B in 8 min; 15 to 28.5% B in 33 min; 28.5 to 34% B in 5.5 min; and 34 to 60% B in 13 min, for a total gradient time of 64.5 min. Using 96 x 2 mL-well plates (Whatman, #7701-5200), fractions were collected every 0.77 min (154 µL each) for a total of 64 fractions through the main elution profile of the separation. The extreme early and late portions of the gradient were collected into two additional larger-volume fractions but were not further analyzed. For each experiment, all fractions were acidified to a final concentration of 1% formic acid and then recombined by pooling every 8th fraction in a step-wise concatenation strategy, as previously reported 3, to yield a total of 8 fractions per experiment. All fractions were dried by vacuum centrifugation and stored at -80°C until mass spectrometric analysis.
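The step-wise concatenation described above can be sketched in a few lines of Python. This is only an illustration: the text states that every 8th fraction was pooled, but the exact assignment used here (pool k receives fractions k, k+8, k+16, ...) and all function and variable names are assumptions. The sketch also re-derives the per-fraction volume and total gradient time from the stated flow rate, collection interval, and gradient segments.

```python
from typing import Dict, List

N_FRACTIONS = 64  # fractions collected across the main elution profile
N_POOLS = 8       # concatenated fractions per experiment


def concatenate(n_fractions: int, n_pools: int) -> Dict[int, List[int]]:
    """Assign 1-based fraction numbers to pools by taking every n_pools-th fraction.

    Assumed scheme: pool k receives fractions k, k + n_pools, k + 2 * n_pools, ...
    """
    pools: Dict[int, List[int]] = {p: [] for p in range(1, n_pools + 1)}
    for fraction in range(1, n_fractions + 1):
        pool = (fraction - 1) % n_pools + 1
        pools[pool].append(fraction)
    return pools


pools = concatenate(N_FRACTIONS, N_POOLS)
for pool, members in pools.items():
    print(pool, members)

# Consistency checks on values stated in the text.
FLOW_UL_PER_MIN = 200.0
COLLECTION_INTERVAL_MIN = 0.77
fraction_volume_ul = FLOW_UL_PER_MIN * COLLECTION_INTERVAL_MIN  # about 154 µL

gradient_segments_min = [5, 8, 33, 5.5, 13]
total_gradient_min = sum(gradient_segments_min)  # 64.5 min
```

Under this assumed scheme, each pool combines 8 fractions spaced evenly across the gradient, so every concatenated sample spans the full hydrophobicity range of the first-dimension separation.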
Chromatography was performed on a 75 μm ID picofrit column (New Objective, Woburn, MA) packed in house with Reprosil-Pur C18 AQ 1.9 μm beads (Dr. Maisch, GmbH, Entringen, Germany) to a length of 20 cm. Columns were heated to 50°C using column heater sleeves (Phoenix-ST) to prevent overpressuring of columns during UHPLC separation. The LC system, column, and platinum wire to deliver electrospray source voltage were connected via a stainless-steel cross (360 μm, IDEX Health & Science, UH-906x). The mobile-phase flow rate was 200 nL/min, and the mobile phase consisted of 3% acetonitrile/0.1% formic acid (Solvent A) and 90% acetonitrile/0.1% formic acid (Solvent B). A 124-minute LC-MS/MS method followed a 10-minute column-equilibration procedure and a 6-minute sample-loading procedure for a 1 µL injection.
The elution portion of the LC gradient was 0-5% solvent B in 2 min, 5-35% in 90 min, 35-59% in 12 min, 59-90% in 2 min, and a hold at 90% solvent B for 10 min, yielding ~12 sec peak widths. Data-dependent LC-MS/MS spectra were acquired in ~2 sec cycles; each cycle consisted of one full Orbitrap MS scan at 60,000 resolution followed by 12 HCD MS/MS scans in the Orbitrap at 15,000 resolution using an isolation width of 2.5 m/z. Dynamic exclusion was enabled with a mass width of ±20 ppm, a repeat count of 1, and an exclusion duration of 50 seconds. Charge-state screening was enabled along with monoisotopic precursor selection to prevent triggering of MS/MS on precursor ions with unassigned charge or a charge state of 1.
For HCD MS/MS scans, the normalized collision energy was 33, the AGC target was 50,000 ions, and the maximum ion time was 200 msec.
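To make the precursor-selection settings concrete, the following Python sketch models dynamic exclusion (±20 ppm, repeat count of 1, 50-second duration) and charge-state screening. It is a simplified illustration and not the instrument's acquisition logic; the class, function names, and example m/z values are hypothetical.

```python
EXCLUSION_PPM = 20.0      # mass width of the exclusion window (from the text)
EXCLUSION_SECONDS = 50.0  # exclusion duration (from the text)


def ppm_difference(mz_a, mz_b):
    """Absolute mass difference between two m/z values in parts per million."""
    return abs(mz_a - mz_b) / mz_b * 1e6


def eligible_charge(charge):
    """Charge-state screening: reject unassigned (None) and singly charged ions."""
    return charge is not None and charge >= 2


class DynamicExclusionList:
    """Simplified model with a repeat count of 1: a precursor is excluded
    immediately after it has been selected once."""

    def __init__(self, ppm=EXCLUSION_PPM, duration_s=EXCLUSION_SECONDS):
        self.ppm = ppm
        self.duration_s = duration_s
        self._entries = []  # list of (m/z, time at which exclusion started)

    def is_excluded(self, mz, time_s):
        # Drop entries whose exclusion duration has elapsed.
        self._entries = [
            (m, t) for m, t in self._entries if time_s - t < self.duration_s
        ]
        return any(ppm_difference(mz, m) <= self.ppm for m, _ in self._entries)

    def select(self, mz, charge, time_s):
        """Return True if the precursor may trigger MS/MS now, recording it if so."""
        if not eligible_charge(charge) or self.is_excluded(mz, time_s):
            return False
        self._entries.append((mz, time_s))
        return True


dx = DynamicExclusionList()
first = dx.select(500.000, 2, time_s=0.0)      # True: nothing excluded yet
repeat = dx.select(500.005, 2, time_s=10.0)    # False: within 20 ppm and 50 s
singly = dx.select(700.000, 1, time_s=10.0)    # False: charge state of 1
later = dx.select(500.000, 2, time_s=60.0)     # True: exclusion has expired
```

With ~12 sec peak widths, a 50-second exclusion window keeps an already-fragmented peptide from being re-selected for the remainder of its elution peak.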

Protein identification
All MS data were interpreted using a fully automated workflow in Spectrum Mill software. Peptide spectrum matches (PSMs) for individual spectra were automatically designated as confidently assigned using the Spectrum Mill autovalidation module to apply target-decoy-based false-discovery rate (FDR) scoring threshold criteria via a two-step auto-threshold strategy at the peptide and protein levels. First, peptide autovalidation was done for each experimental replicate of 8 LC-MS/MS runs using an auto-thresholds strategy with a minimum sequence length of 6, automatic variable range precursor mass filtering, and score and delta Rank1 - Rank2 score thresholds optimized to yield a spectral-level FDR estimate for precursor charges 2 through 4 of <1.6% for each precursor charge state in each LC-MS/MS run. For precursor charge 5, thresholds were optimized to yield a spectral-level FDR estimate of <0.8% across all 8 runs per experiment (instead of each run), to achieve reasonable statistics since many fewer spectra are generated for the higher charge state. Second, protein polishing autovalidation was applied to further filter all the peptide-level validated spectra with the primary goal of eliminating peptides identified with low-scoring PSMs that represent proteins identified by a single peptide in a single sample, so-called one-hit wonders. The following parameters were used: minimum number of experiments in which a protein group is observed: 1; minimum protein score: 15; and maximum protein FDR: 0%. After assembling protein groups from the autovalidated peptides for an experiment, protein polishing determined the maximum protein-level score of a protein group that consists entirely of distinct peptides estimated to be false-positive identifications (PSMs with negative delta forward-reverse scores).
PSMs were then removed from the set obtained in the initial peptide-level autovalidation step if they contributed to protein groups with protein scores at or below the larger of the minimum protein score and the maximum false-positive protein score. A protein group would be estimated to be a false positive if it was identified entirely on the basis of peptides estimated to be false positives. None of these remained after the thresholding in the protein-polishing step. In the filtered results, each identified protein was detected with multiple peptides unless a single excellent-scoring peptide was the sole match.
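The protein-polishing threshold described above can be illustrated with a short Python sketch. It is not Spectrum Mill code: the data layout, function name, and example scores are hypothetical, and filtering is shown at the protein-group level for simplicity rather than per PSM.

```python
MIN_PROTEIN_SCORE = 15.0  # minimum protein score stated in the text


def polish(protein_groups, min_protein_score=MIN_PROTEIN_SCORE):
    """Return (kept groups, threshold applied).

    Each group is a dict with:
      'name'   - hypothetical identifier
      'score'  - protein-level score
      'deltas' - delta forward-reverse score of each distinct peptide
    A group whose peptides all have negative deltas is treated as an
    estimated false positive.
    """
    false_positive_scores = [
        g["score"] for g in protein_groups if all(d < 0 for d in g["deltas"])
    ]
    max_fp_score = max(false_positive_scores, default=0.0)
    # Threshold is the larger of the minimum score and the best false positive.
    threshold = max(min_protein_score, max_fp_score)
    kept = [g for g in protein_groups if g["score"] > threshold]
    return kept, threshold


# Made-up example values, for illustration only.
groups = [
    {"name": "group_A", "score": 120.0, "deltas": [5.0, 8.0, 3.0]},
    {"name": "group_B", "score": 18.0, "deltas": [-1.0, -2.0]},  # false positive
    {"name": "group_C", "score": 17.0, "deltas": [2.0]},
    {"name": "group_D", "score": 25.0, "deltas": [4.0, 1.0]},
]
kept, threshold = polish(groups)
kept_names = [g["name"] for g in kept]
```

In this toy example the best-scoring estimated false positive (18) exceeds the minimum protein score (15), so the threshold becomes 18 and the single-peptide group scoring 17 is removed along with the false positive itself.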
These autovalidation steps yielded a spectrum-level FDR estimate of <0.7% and a peptide-level FDR estimate of <1.1% for each experiment. In aggregate across both experiments, the estimated FDRs were 0.64% at the spectrum level, 1.22% at the peptide level, and <0.03% (1/3701) at the protein level. Since the protein-level FDR estimate neither explicitly requires a minimum number of distinct peptides per protein nor adjusts for the number of possible tryptic peptides per protein, it may underestimate false-positive protein identifications for large proteins observed only on the basis of multiple low-scoring PSMs.
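As a quick check of the protein-level figure, the arithmetic behind "1/3701" is shown below, together with a generic target-decoy estimate. The simple decoy-to-target ratio is a common convention and is an assumption here; the exact formula applied by Spectrum Mill is not given in the text, and the PSM counts in the second example are made up.

```python
def percent(numerator, denominator):
    """Express a ratio as a percentage."""
    return 100.0 * numerator / denominator


# Protein-level estimate from the text: 1 estimated false positive in 3701 groups.
protein_fdr_pct = percent(1, 3701)
print(f"protein-level FDR: {protein_fdr_pct:.3f}%")


def target_decoy_fdr_pct(n_decoy_hits, n_target_hits):
    """Generic target-decoy FDR estimate (assumed form): decoy hits / target hits."""
    return percent(n_decoy_hits, n_target_hits)


# Hypothetical counts, for illustration only.
example_fdr_pct = target_decoy_fdr_pct(n_decoy_hits=5, n_target_hits=1000)
```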
In calculating scores at the protein level and reporting the identified proteins, redundancy is addressed in the following manner: the protein score is the sum of the scores of distinct peptides.
A distinct peptide is the single highest-scoring instance of a peptide detected through an MS/MS spectrum. MS/MS spectra for a particular peptide may have been recorded multiple times (i.e., as different precursor charge states, in adjacent bRP fractions, or in different modification states) but are still counted as a single distinct peptide. When a peptide sequence >8 residues long is contained in multiple protein entries in the sequence database, the proteins are grouped together and the highest-scoring one and its accession number are reported. In some cases when the protein sequences are grouped in this manner, there are distinct peptides that uniquely represent a lower-scoring member of the group (isoforms, family members, or different species). Each of these instances spawns a subgroup, and multiple subgroups are reported and counted towards the total number of proteins. Peptides shared between subgroups were counted toward each subgroup's count of distinct peptides and protein-level iTRAQ quantitation. As listed in Supplemental Table 1A, assembly of confidently identified PSMs from both experiments into proteins yields 4135 total protein subgroups from 3701 protein groups.
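The scoring rule above (protein score as the sum of the best score per distinct peptide) can be sketched as follows. The record layout, function names, placeholder sequences, and scores are hypothetical; keying on the unmodified sequence reflects the statement that different charge states, fractions, and modification states count as a single distinct peptide.

```python
def distinct_peptide_scores(psms):
    """Keep the single highest score for each peptide sequence.

    Each PSM is a dict with 'sequence' (unmodified sequence) and 'score'.
    Repeated observations of one sequence collapse to one distinct peptide.
    """
    best = {}
    for psm in psms:
        seq = psm["sequence"]
        if seq not in best or psm["score"] > best[seq]:
            best[seq] = psm["score"]
    return best


def protein_score(psms):
    """Protein score: sum of the scores of the distinct peptides."""
    return sum(distinct_peptide_scores(psms).values())


# Made-up PSMs with placeholder sequences, for illustration only.
example_psms = [
    {"sequence": "PEPTIDEA", "charge": 2, "fraction": 3, "score": 12.5},
    {"sequence": "PEPTIDEA", "charge": 3, "fraction": 4, "score": 10.0},
    {"sequence": "PEPTIDEB", "charge": 2, "fraction": 6, "score": 9.0},
]
n_distinct = len(distinct_peptide_scores(example_psms))
score = protein_score(example_psms)
```

Here the two observations of the first sequence contribute only their best score (12.5), so the protein is credited with 2 distinct peptides and a score of 21.5.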
The raw mass spectrometry data and the sequence database used for searches have been deposited in the public proteomics repository MassIVE and are accessible at ftp://MSV000080124@massive.ucsd.edu.
We further used the matrisome classification we previously defined 4 to categorize all of the identified protein subgroups as being ECM-derived or not (Supplemental Table 1B).

Protein quantitation
Relative protein quantitation was done using iTRAQ ratios for the 4 time points (normal islets, hyperplastic islets, angiogenic islets, and insulinomas). Reporter-ion intensities were corrected for isotopic impurities in the Spectrum Mill protein/peptide summary module using the static correction method and correction factors obtained from the reagent manufacturer's certificate of analysis for lot number A2157: http://sciex.com/Documents/Downloads/Certificates of Analysis/Certificates of Analysis for iTRAQ Reagents/iTRAQ-Reagent-Multiplex-Kit-4352135-A2157.pdf. Spectrum Mill used the reporter-ion intensities to calculate the iTRAQ ratios for each PSM. A protein-level iTRAQ ratio was calculated as the median of all PSM-level ratios contributing to the protein after excluding PSMs that lacked an iTRAQ label, had a negative delta forward-reverse score (half of all false-positive identifications), or had a precursor-ion purity <50% (MS/MS spectra with significant precursor-isolation contamination from coeluting peptides). To account for differences in ECM protein amount between the individual time-point samples within one iTRAQ 4-plex experiment, all iTRAQ time-point ratios were normalized to the ECM-population median in the dataset. It is important to note that protein abundance ratios measured with iTRAQ quantitation can be compressed by 20-30% due to co-isolation interference and that real effect sizes might therefore be larger than those measured 5.
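The quantitation steps above (PSM filtering, median roll-up to the protein level, and normalization to the ECM-population median) can be sketched in Python as follows. This is an illustration under stated assumptions rather than the Spectrum Mill implementation: the field names, function names, and numeric values are hypothetical, and normalization is shown as division of linear ratios by the ECM median, whereas the original analysis may have operated on log-transformed ratios.

```python
from statistics import median

MIN_PURITY_PCT = 50.0  # PSMs with precursor-ion purity below this are excluded


def usable(psm):
    """Apply the three exclusion criteria stated in the text."""
    return (
        psm["has_itraq_label"]
        and psm["delta_fwd_rev"] >= 0
        and psm["purity_pct"] >= MIN_PURITY_PCT
    )


def protein_ratio(psms):
    """Median of the PSM-level ratios that pass filtering (None if none pass)."""
    ratios = [p["ratio"] for p in psms if usable(p)]
    return median(ratios) if ratios else None


def normalize_to_ecm_median(protein_ratios, ecm_proteins):
    """Divide every protein ratio by the median ratio of the ECM proteins."""
    ecm_median = median(
        r for name, r in protein_ratios.items() if name in ecm_proteins
    )
    return {name: r / ecm_median for name, r in protein_ratios.items()}


# Made-up PSMs for one hypothetical protein and one time-point ratio.
example_psms = [
    {"ratio": 2.0, "has_itraq_label": True, "delta_fwd_rev": 4.1, "purity_pct": 88.0},
    {"ratio": 2.4, "has_itraq_label": True, "delta_fwd_rev": 6.3, "purity_pct": 72.0},
    {"ratio": 1.8, "has_itraq_label": True, "delta_fwd_rev": 5.0, "purity_pct": 30.0},
    {"ratio": 2.2, "has_itraq_label": True, "delta_fwd_rev": 2.7, "purity_pct": 95.0},
    {"ratio": 9.9, "has_itraq_label": False, "delta_fwd_rev": 3.0, "purity_pct": 90.0},
]
ratio = protein_ratio(example_psms)  # median of 2.0, 2.4, 2.2

# Made-up protein-level ratios; all three are treated as ECM proteins.
raw = {"ecm_1": 2.2, "ecm_2": 0.55, "ecm_3": 1.1}
normalized = normalize_to_ecm_median(raw, ecm_proteins={"ecm_1", "ecm_2", "ecm_3"})
```

Using the median at both steps limits the influence of individual outlier PSMs and of the few proteins with large abundance changes on the normalization factor.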