Proteomics Analysis of Cancer Exosomes Using a Novel Modified Aptamer-based Array (SOMAscan™) Platform

We have used a novel affinity-based proteomics technology to examine the protein signature of small secreted extracellular vesicles called exosomes. The technology uses a new class of protein binding reagents called SOMAmers® (slow off-rate modified aptamers) and allows the simultaneous precise measurement of over 1000 proteins. Exosomes were highly purified from the Du145 prostate cancer cell line, by pooling selected fractions from a continuous sucrose gradient (within the density range of 1.1 to 1.2 g/ml), and examined under standard conditions or with additional detergent treatment by the SOMAscan™ array (version 3.0). Lysates of Du145 cells were also prepared, and the profiles were compared. Housekeeping proteins such as cyclophilin-A, LDH, and Hsp70 were present in exosomes, and we identified almost 100 proteins that were enriched in exosomes relative to cells. These included proteins of known association with cancer exosomes such as MFG-E8, integrins, and MET, and also those less widely reported as exosomally associated, such as ROR1 and ITIH4. Several proteins with no previously known exosomal association were confirmed as exosomally expressed in experiments using individual SOMAmer® reagents or antibodies in micro-plate assays. Western blotting confirmed the SOMAscan™-identified enrichment of exosomal NOTCH-3, L1CAM, RAC1, and ADAM9. In conclusion, we describe here over 300 proteins of hitherto unknown association with prostate cancer exosomes and suggest that the SOMAmer®-based assay technology is an effective proteomics platform for exosome-associated biomarker discovery in diverse clinical settings.

Prostate carcinoma is the most frequent male cancer, with an estimated 240,000 newly diagnosed individuals and 28,000 deaths in the United States during 2012 (National Cancer Institute (NIH)). Methods for detecting this cancer are based on a combination of physical examination through digital rectal examination, clinical imaging, quantification of circulating levels of prostate specific antigen (PSA), and transrectal ultrasound-guided biopsy. As a non-invasive test, PSA measurement is still widely used, but it remains insensitive, as around 15% of men with normal levels of PSA will have prostate cancer according to biopsy results (1), and 60% of men with elevated PSA levels may have other, noncancerous conditions but be subjected to further, unnecessary investigations and interventions (2). PSA may be of better utility in monitoring disease progression (2). An ability to diagnose the disease more specifically at an early stage is likely to save lives and alleviate the healthcare burden and morbidities arising from misdiagnosis. In addition, methods for monitoring the course of the disease in a non-invasive and perhaps predictive manner would offer increased patient benefit, enabling early detection of imminent relapse under hormone therapy, for example. Therefore, there is a clinical need for improved molecular approaches for disease diagnosis and monitoring in these settings.
Small vesicles termed exosomes are present in body fluids, including serum, plasma, urine, and seminal plasma (3)(4)(5)(6)(7), and their isolation and examination may prove useful as a minimally invasive means of obtaining a complex set of disease markers. Exosomes are secreted by most, if not all, cell types and are generally accepted as derived principally from multivesicular bodies of the late endocytic tract (8), although examples of plasma membrane budding nanovesicles of similar phenotype have also been described (9). Exosomes are particularly enriched in membrane proteins and in factors related to such endosomal compartments. They also contain proteins found in the cytosol, but they poorly represent components of organelles such as the mitochondria, nucleus, and endoplasmic reticulum (10). Exosomes also comprise an assortment of coding and noncoding RNA. There has been considerable global effort toward defining disease-related alterations in exosomal RNA. However, it is well established that aberrant alterations in cancer cells in response to metabolic, hypoxic, or other forms of stress are reflected in protein changes in the exosomes produced (11)(12)(13). Thus exosomes from diseased origins can be distinguished from those of a normal phenotype based on their protein profiles alone.
Proteomics studies using mass spectrometry (MS) have previously been conducted on prostate cancer exosomes/microvesicles obtained from cell lines (14,15), xenotransplantation models (16), or ex vivo biofluids (17). Hundreds of proteins with putative associations with exosomes/microvesicles have been identified. These studies highlight several interesting candidate markers of potential biomarker utility that are currently being explored. However, global proteomic approaches of this nature can have two major limitations. First, although the most abundant proteins are more likely to be identified by MS, it is difficult to infer information about relative abundances of proteins in complex samples when using these methods. Second, given the often exacting, difficult-to-reproduce, and time-consuming workflows involved, such technologies are poorly suited for the analysis of a large number of samples. Multiplex protein array methodologies have the potential to overcome such issues and offer quantification and options for more rapid sample throughput. However, most platforms are based on antibodies, and these arrays are typically limited to <100 proteins, principally because the cross-reactivity of secondary antibodies can negatively affect assay specificity (18).
A recently developed proteomics platform, termed SOMAscan™, provides a new generation of protein detection technologies. The platform is capable of the simultaneous quantitative analysis of 1129 proteins per sample in its current form. It is also an approach well suited to handling large numbers of specimens required for well-powered clinical studies (19). The key to this technology, which is described in detail by Gold et al. (20,21), is the use of slow off-rate modified aptamers (SOMAmers) containing chemically modified nucleotides. This confers greater stability, expanded target range, and improved affinity for the target proteins. This multiplex platform has been applied successfully to small volumes (~15 µl) of plasma specimens from chronic renal disease patients (20), serum specimens from mesothelioma (22) or lung cancer patients (19), tissue lysates (23), and cerebrospinal fluid (24). However, to date, the compatibility of this array technology with exosomes as the specimen has not been investigated.
The purpose of the current study was to examine the utility of this evolving technology in profiling the protein repertoire of exosomes. Research was conducted using highly pure exosomes isolated from a prostate cancer cell line, and we compared this sample to the protein profile of the parent cells. By so doing, we obtained evidence of the compatibility of the platform with this difficult, membranous sample and identified several proteins of previously unknown association with exosomes. In summary, SOMAscan™ is a versatile tool for probing the composition of exosomes and is a suitable platform to provide a high-throughput approach for exosome-based biomarker discovery in prostate cancer and other clinical settings.

EXPERIMENTAL PROCEDURES
Cell Culture-Du145 is a cell line originating from the metastasis of prostate carcinoma (25); the material used here was purchased from ATCC. The cells were seeded into bioreactor flasks (Integra, Nottingham, UK) and maintained at high-density culture for exosome production, as previously described (26). The cells were cultured in RPMI 1640 (Lonza, Wilford, Nottingham, UK) supplemented with penicillin/streptomycin and 5% fetal bovine serum (FBS) that had been depleted of exosomes via overnight ultracentrifugation at 100,000g followed by filtration through 0.2-µm and then 0.1-µm vacuum filters (Millipore, Watford, UK). Cells were confirmed negative for mycoplasma contamination by monthly screening (Mycoalert, Lonza). Du145 lysates were prepared with MPER lysis buffer (Pierce) containing protease inhibitor mixture (Insight Biotechnology Ltd, Wembley, Middlesex, UK) following three washes in PBS and stored at −80°C. Protein concentrations were determined using a micro-BCA assay (Pierce/Thermo). Lysis buffer in the absence of cells was used as a background control in the array.
Exosome Purification-The culture medium of Du145 cells was subjected to serial centrifugation to remove cells (400g for 10 min) and cellular debris (2000g for 15 min). The supernatant was then centrifuged at 10,000g for 30 min to remove any remaining debris or large/dense vesicles. Exosomes were concentrated into a pellet from the supernatant by ultracentrifugation at 100,000g. The pellet was resuspended in 200 µl of PBS and then overlaid on a freshly prepared continuous sucrose gradient (0.2 M up to 2.5 M sucrose) (8,27). This was centrifuged at 4°C overnight at 210,000g using an MLS-50 rotor in an Optima-Max ultracentrifuge (Beckman Coulter). Fifteen fractions of around 330 µl each were collected, and the refractive index was measured at 20°C using an automatic refractometer (J57WR-SV, Rudolph Research Analytical, Hackettstown, NJ). The density of each fraction was calculated as previously described (8). An aliquot of each fraction (15 µl or 100 µl) was used for nanoparticle tracking analysis or for flow cytometric analysis, respectively. Based on these analyses, fractions were selected and pooled. After being washed in PBS, the pooled specimen was resuspended in 100 µl of PBS, and 10 µl was used to determine the protein concentration using the micro-BCA assay. The remainder was stored at −80°C ready to be shipped to SomaLogic (Boulder, CO) on dry ice. For both pilot and validation work, exosomes were purified by a simpler but related method. This involved ultracentrifugation of similarly pre-cleared culture media on a cushion of 30% sucrose/D2O that captured vesicles at a density of <1.2 g/ml. After 1 h, the middle of the cushion was collected and diluted in excess PBS before exosomes were pelleted at 100,000g (28,29).
Nanoparticle Tracking Analysis-A 15-µl aliquot of each fraction was taken and particle counts and particle size distribution were determined using the Nanosight LM10 system (NanoSight Ltd, Amesbury, UK) configured with a 405-nm laser and a high-sensitivity digital camera system (OrcaFlash2.8, Hamamatsu C11440, NanoSight Ltd, Amesbury, UK). 30-s videos were taken and analyzed using NTA software (version 2.3), with the minimal expected particle size set to automatic and camera sensitivity and detection thresholds set to 14 and 3, respectively, to reveal small particles. Each fraction was diluted in nanoparticle-free water (Fresenius Kabi, Runcorn, UK) to a concentration between 2 × 10⁸ and 9 × 10⁸ particles per milliliter, within the linear range of the instrument. A mock gradient where no sample was added was also analyzed, revealing negligible counts for particles related to the sucrose gradient (not shown). This mock sample acted as a background control for the SOMAscan™ array.
Analysis of Exosome Proteins-For characterizing the content of sucrose gradient fractions, a 100-µl aliquot of each fraction was washed in 1.6 ml of MES buffer (0.025 M MES, 0.154 M NaCl, pH 6) and concentrated by ultracentrifugation at 120,000g prior to incubation with aldehyde-sulfate latex beads (3.9-µm diameter, Invitrogen). After overnight coupling and blocking (with 1% (w/v) BSA/0.1% (w/v) glycine in MES buffer for 2 h at room temperature), beads were stained with primary monoclonal antibodies including anti-CD9 (R&D Systems, Abingdon, UK), CD81, CD63 (Serotec, AbD Serotec, Oxford, UK), MHC Class I (eBioscience, Hatfield, UK), prostate specific membrane antigen (Santa Cruz), and an isotype control (eBiosystems); they were used at 2 to 10 µg/ml for 1 h at 4°C. After one wash, goat anti-mouse-Alexa-488-conjugated antibody (Invitrogen) diluted 1:200 in 0.1% (w/v) BSA/MES buffer was added for 1 h. After washing, beads were analyzed by flow cytometry as described elsewhere (29) using a FACSCanto instrument configured with a high-throughput sampling module and running FACSDiva v6.1.2 software (Becton Dickinson). The median fluorescence values of the histogram are shown in plots. In a similar fashion, substituting high-protein-binding ELISA plates for beads, purified exosomes were immobilized on plates (at doses of ≤10 µg per well) overnight in PBS and blocked for 2 h in 1% (w/v) BSA/PBS. In some experiments, fractions from sucrose gradients, washed in PBS, were immobilized on the plates. Primary antibodies (at 2 µg/ml) added for 1 h included anti-Tissue Factor, Glypican-3, CD36, uPA (Santa Cruz), RAC1 (Becton Dickinson), VEGF-A (Preprotech, London, UK), ADAM9, and Notch3 (R&D Systems). Detection was by goat anti-mouse biotinylated antibodies (PerkinElmer Life). To assess signal, we added Europium-streptavidin conjugate and, following six washes, measured it via time-resolved fluorimetry on a Wallac Victor-II multi-label plate reader (PerkinElmer Life).
For some experiments, exosomes were pre-labeled overnight with primary antibody and, after an ultracentrifugation-based wash, labeled with biotinylated secondary antibody. After a second wash, samples were added to anti-CD9 antibody-coated plates and incubated overnight. After washing, Europium-streptavidin was added, and after six washes it was measured as described above. This served to demonstrate a co-localization of proteins with CD9. For some experiments, primary/secondary antibodies were substituted for individual biotinylated SOMAmers®, used at 10 nM in SB17/0.05% (w/v) Tween20 buffer, and washes were conducted in SB17 buffer containing 0.5% (w/v) BSA. Bound SOMAmers® were detected with Europium-streptavidin as described above.
Preparation of Samples for the SOMAscan™ Array-Exosome or cell samples were adjusted to a final concentration of 20 µg/ml in SB17+Tween buffer (102 nM NaCl, 5 mM KCl, 5 mM MgCl2, 1 mM EDTA, 40 mM Hepes, pH 7.5, 0.05% Tween-20) before being added to the SOMAscan™ array workflow. This represents the standard conditions under which the array operates with samples such as serum. To aid in the solubilization/liberation of proteins from vesicles that might otherwise be inaccessible to SOMAmers®, exosome and cell lysate samples were also prepared at a total protein concentration of 200 µg/ml in SB17 buffer containing 1% (w/v) Nonidet P-40/0.5% (w/v) deoxycholate (DCO). Samples were incubated for 15 min at 37°C, centrifuged for 5 min at 14,000g, and then diluted 10-fold in SB17+Tween buffer for analysis via SOMAscan™ assay. The selection of such conditions is described in supplemental Fig. S2. Samples were analyzed via SomaLogic Biomarker Discovery assay using an Agilent microarray read-out that measures 1129 proteins. This assay, summarized in supplemental Fig. S1, is similar to the earlier version 2 of the assay detailed by Gold et al. (20) and uses SOMAmers® to transform protein concentration into a corresponding DNA concentration through a series of steps involving affinity binding and capture of biotin onto streptavidin beads. The final DNA concentration is measured in relative fluorescence units (RFU) from the fluorescent SOMAmer® hybridized to a complementary probe on custom microarray slides.
Data Handling and Presentation-RFU output from the array was subjected to background subtraction. For exosomes, this involved the use of a mock sucrose gradient to which no exosomes were added. For cells, this was lysis buffer in the absence of cells. Each was diluted the same amount as the equivalent samples in SB17 buffer ± 1% (w/v) Nonidet P-40/0.5% (w/v) DCO. For each condition (exosomes or cells in standard SB17 or in SB17 + Nonidet P-40/DCO buffer), samples were run in triplicate on the SOMAscan™ array (v3.0). For all figures except supplemental Fig. S2, where all evaluable identifications are shown, the data were trimmed. We initially filtered the dataset to remove a small number of proteins whose coefficient of variation within either experimental group was greater than 5. The data were then log-transformed, which gave them a normal distribution (confirmed via Shapiro-Wilk test). The remaining data could therefore be assessed for significance in a row-by-row t test, correcting for multiple testing using the Benjamini-Hochberg procedure. All the proteins that were used in our analysis were significant at the 5% level after correction for multiple testing. Heat maps were generated using Gene-E (version 3.0.34, The Broad Institute, Cambridge, MA), and column clustering was performed using one minus Pearson correlation with the average linkage method. To discriminate presence from absence, a conservative cutoff RFU value of 200 was chosen, based on prior studies with the platform (20), and this threshold was used in the selection of candidates of interest and in comparisons with the published Vesiclepedia database. The Vesiclepedia database for MS-based exosome proteomics (30) contained 11 database entries corresponding to "human," "exosomes," and "prostate" as search terms, and from these a list of 532 gene names was compiled. The overlap between the array and Vesiclepedia was evaluated using BioVenn (31).
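The filtering and multiple-testing steps described above can be sketched in a few lines of Python. This is an illustration only, not the analysis code used in the study: the function names are hypothetical, the coefficient-of-variation cutoff is read literally from the text as a ratio of standard deviation to mean, the base-2 logarithm is an assumption (the base is not stated), and the per-protein t test itself is assumed to be computed elsewhere, so that only the Benjamini-Hochberg adjustment of its p values is shown.

```python
import math
import statistics


def coefficient_of_variation(replicates):
    """Sample standard deviation divided by the mean of the replicates."""
    return statistics.stdev(replicates) / statistics.mean(replicates)


def passes_cv_filter(replicates, max_cv=5.0):
    """Keep a protein only if its replicate CV does not exceed max_cv.

    The default of 5 follows the cutoff quoted in the text; its units
    are an assumption here.
    """
    return coefficient_of_variation(replicates) <= max_cv


def log_transform(values):
    """Log-transform background-subtracted RFU values (must be > 0).

    Base 2 is an assumption made for this sketch.
    """
    return [math.log2(v) for v in values]


def benjamini_hochberg(p_values):
    """Return Benjamini-Hochberg adjusted p values in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value to the smallest so that adjusted
    # values never increase as the raw p value decreases.
    for rank in range(m, 0, -1):
        i = order[rank - 1]
        running_min = min(running_min, p_values[i] * m / rank)
        adjusted[i] = running_min
    return adjusted


# Toy p values (hypothetical), one per protein row.
raw_p = [0.01, 0.04, 0.03, 0.20]
adjusted_p = benjamini_hochberg(raw_p)
significant = [q <= 0.05 for q in adjusted_p]
```

With these toy values only the first protein remains significant at the 5% level after correction, which illustrates why the adjustment is applied before any protein is carried forward.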
Bar graphs were generated using GraphPad Prism version 4.00 for Windows (GraphPad Software, San Diego, CA). When protein identifications are represented as genes, some SOMAmers report with two or more gene names. For example, an individual SOMAmer may recognize a protein complex between α and β integrin chains. The gene names for α and β chains are reported here with the identical RFU value, as both α and β chains are required in order for a signal to be detected. Examples of identifications featuring this aspect are presented in supplemental Table S3. For the bioinformatics analyses, these ambiguous identifications were removed from the analysis. For comparisons with other MS-based studies, analyses were performed both including (supplemental Fig. S3) and excluding (supplemental Fig. S4) these ambiguous identifications, as annotated in the text.
Bioinformatics Analysis-The genes with the greatest fold changes were analyzed using the DAVID bioinformatics tool to see whether they were enriched for any particular biological themes (32,33). The DAVID database provides a functional analysis tool to determine whether a list of genes is enriched for a particular biological theme from a set of ontology-related resources. The gene lists of SB17 + Nonidet P-40/DCO and SB17 conditions were each ranked in order of fold change in exosomes (decreasing from the highest). The aforementioned ambiguous identifications due to SOMAmers recognizing more than one gene were removed from the analysis. These reduced lists were then combined to create a single, unique-entry background list for the DAVID analysis. We selected the 50 genes with the greatest fold changes in exosomes from both conditions (we investigated other gene list sizes, but there was little variation in the results for lists around this size), using Entrez gene accession numbers as the method of gene annotation, and analyzed them in DAVID against the background list that we had created. The resulting output of the significantly enriched biological themes, depicted in supplemental Table S4, was visualized in a network diagram format using Cytoscape (34) with the Enrichment Map plugin (35). We used a p value cutoff of 0.005, a false discovery rate Q-value cutoff of 0.1, and a gene overlap index cutoff of 0.5. To aid visualization, we used a simple clustering method based on shared gene membership between terms to group and color-code the nodes of the output and indicate their high percentage of shared genes within these groups. Each member of a cluster was, on average, at least 90% similar to the others in terms of gene content.
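The list preparation that precedes the DAVID analysis (ranking by fold change, removing ambiguous identifications, merging into a unique-entry background, and taking the top genes) might be sketched as below. The function names, gene identifiers, and fold-change values are hypothetical placeholders, and taking the top genes from each condition's list separately is one reading of the text rather than a confirmed detail; the submission to DAVID itself is not shown.

```python
def rank_by_fold_change(fold_changes, ambiguous=frozenset()):
    """Return gene IDs sorted by exosome/cell fold change, highest first.

    Identifications listed in `ambiguous` (SOMAmers reporting more than
    one gene) are dropped before ranking.
    """
    kept = {g: fc for g, fc in fold_changes.items() if g not in ambiguous}
    return sorted(kept, key=kept.get, reverse=True)


def build_background(*ranked_lists):
    """Merge several gene lists into one list of unique entries.

    First-seen order is preserved so the result is reproducible.
    """
    seen = {}
    for genes in ranked_lists:
        for gene in genes:
            seen.setdefault(gene, None)
    return list(seen)


def top_genes(ranked, n=50):
    """Take the n genes with the greatest fold changes."""
    return ranked[:n]


# Hypothetical fold changes for the two assay conditions.
detergent = {"GENE_A": 180.0, "GENE_B": 4.0, "GENE_C|GENE_D": 40.0}
standard = {"GENE_E": 25.0, "GENE_B": 1.2}
ambiguous_ids = {"GENE_C|GENE_D"}

ranked_detergent = rank_by_fold_change(detergent, ambiguous_ids)
ranked_standard = rank_by_fold_change(standard, ambiguous_ids)
background = build_background(ranked_detergent, ranked_standard)
query = top_genes(ranked_detergent, n=50) + top_genes(ranked_standard, n=50)
```

Building the background from the array content, rather than from the whole genome, keeps the enrichment test from rewarding themes that are simply over-represented among the proteins the array is able to measure.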

Compatibility of Exosomes with the SOMAscan™ Array Platform-It was important to investigate various sample preparation conditions to maximize the signal for as many analytes as possible, as exosomes were a hitherto untested specimen type for the SOMAscan™ array system (schematically depicted in supplemental Fig. S1). Sucrose cushion-purified exosomes (supplemental Fig. S2A) were subjected to differing detergents and other reagents and run on an earlier beta version of SOMAscan™ v3.0, using v3.0 conditions and assay v2.0 SOMAmer mixes that contained 1034 SOMAmers with data analysis performed on 300 of these (supplemental Fig. S2B). The detergent conditions elevated the signal generated for many but not all analytes, and the signals for some analytes were negatively affected. The MPER and 1% (w/v) Nonidet P-40/0.5% (w/v) DCO conditions gave comparable results (supplemental Figs. S2B and S2C), but the average RFU output was highest with the 1% (w/v) Nonidet P-40/0.5% (w/v) DCO condition, which was selected as the preferred method.
The addition of DTT, which aids MS-based analyses of exosomes (29), did not confer an advantage for this array platform and abrogated otherwise strong signals for several proteins (supplemental Fig. S2C). Whether the effects on signal reduction are due to the poor liberation of protein(s) from the vesicle or to denaturing of the epitope to which the SOMAmer® reagents bind remains unknown. Hence we chose to analyze exosomes under both standard buffer (SB17 + Tween buffer alone) and 1% (w/v) Nonidet P-40/0.5% (w/v) DCO in SB17 + Tween buffer conditions in order to avoid potential underestimation of protein levels due to such effects.
Purification of High-quality Du145 Exosomes-We undertook a continuous sucrose gradient isolation of exosomes to ensure the highest possible quality of purified exosomes for analysis on the current v3.0 array. These well-established methods separate vesicles based on their flotation properties, and for exosomes this has been defined as between 1.1 and 1.2 g/ml density (8). Fifteen fractions were collected serially from the gradient, and the density was determined via refractometry. An aliquot of each fraction was evaluated by means of nano-particle tracking analysis for the presence of exosome-sized nanoparticles. With each fraction the density increased serially as expected, and nano-particle tracking analysis revealed that the majority of nano-particulate material was focused into fractions 8, 9, and 10 (Fig. 1A), which coincided with a density between 1.12 and 1.17 g/ml. Size distribution analysis revealed a monodisperse population of particles in these fractions with a mean hydrodynamic diameter of <150 nm (Fig. 1B). An aliquot of each fraction was coupled to aldehyde sulfate latex beads and stained for a number of exosome surface proteins including CD9, CD81, CD63, MHC Class I, and prostate specific membrane antigen. Flow cytometric analysis of the beads revealed peak expression of these proteins within fractions 8 to 10 (Fig. 1C). Fraction 11 stained strongly for CD9 but not the other markers, and it was found to contain relatively few particles in nano-particle tracking analysis, so it was not included in the pool. The specimen derived for downstream analysis was formed by pooling fractions 8, 9, and 10, which represented the correct density, size, and molecular phenotype for exosome vesicles. Material present in the other fractions, presumably containing nonexosomal constituents or dense aggregates of exosomes, was discarded.
Array Analysis of Du145 Exosomes-Exosomes under SB17 or SB17 + Nonidet P-40/DCO conditions were analyzed in triplicate using the current and full version of SOMAscan™ v3.0, revealing advantageous RFU output following SB17 + Nonidet P-40/DCO treatment for 229 analytes (based on an arbitrary elevation of ≥10%). The level of signal increase above standard conditions was variable from analyte to analyte, and in some cases the signal was elevated up to 15-fold. Some examples of proteins within this list are shown in supplemental Fig. S2D. For a further 199 analytes, the SB17 + Nonidet P-40/DCO conditions did not have a major effect on the signal strength (<10% difference) (supplemental Fig. S2E). However, for the majority (698 analytes) there was a loss in signal of >10% due to SB17 + Nonidet P-40/DCO treatment (supplemental Fig. S2F). Based on these findings we chose to generate two protein lists for subsequent analysis: (i) SB17 + Nonidet P-40/DCO conditions, taking RFU values of those elevated >10% by these conditions, and (ii) SB17 standard conditions, with RFU values taken for those remaining analytes not elevated >10% from samples in standard SB17 buffer.
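The three-way grouping of analytes and the resulting two protein lists can be illustrated with a short Python sketch. It is a hedged reconstruction of the logic described above, not the authors' code: the function names and RFU values are hypothetical, standard-condition RFU values are assumed to be positive, and the treatment of changes of exactly 10% is an assumption because the text does not define the boundary in both directions.

```python
def classify_detergent_effect(rfu_standard, rfu_detergent, threshold=0.10):
    """Label an analyte by its relative RFU change under detergent.

    Returns "elevated", "reduced", or "unchanged". rfu_standard must be
    greater than zero. Boundary handling at exactly +/-10% is assumed.
    """
    change = (rfu_detergent - rfu_standard) / rfu_standard
    if change >= threshold:
        return "elevated"
    if change <= -threshold:
        return "reduced"
    return "unchanged"


def split_lists(standard, detergent, threshold=0.10):
    """Build the two analysis lists described in the text.

    Analytes elevated by detergent keep their detergent RFU value; all
    remaining analytes keep their standard-buffer RFU value.
    """
    detergent_list, standard_list = {}, {}
    for analyte, rfu_std in standard.items():
        rfu_det = detergent[analyte]
        label = classify_detergent_effect(rfu_std, rfu_det, threshold)
        if label == "elevated":
            detergent_list[analyte] = rfu_det
        else:
            standard_list[analyte] = rfu_std
    return detergent_list, standard_list


# Hypothetical mean RFU values for three analytes.
standard_rfu = {"ANALYTE_1": 100.0, "ANALYTE_2": 100.0, "ANALYTE_3": 100.0}
detergent_rfu = {"ANALYTE_1": 150.0, "ANALYTE_2": 105.0, "ANALYTE_3": 40.0}
detergent_list, standard_list = split_lists(standard_rfu, detergent_rfu)
```

In this toy case only the first analyte enters the detergent list; the other two are read from the standard buffer, so each analyte is represented once, under the condition that did not suppress its signal.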
Comparing Du145 Exosomes with Cells-The identification of exosomal proteins that are enriched relative to parent cells is of considerable interest. Such information gives clues for potentially novel exosome functions and exosome manufacture. Furthermore, these may be the proteins of likely greatest utility in clinical applications such as exosome-based diagnostics. Thus we focused our attention on those identifications exhibiting increased expression in exosomes relative to parent cells. As certain identified proteins exhibited some variation in the triplicate measurements, we decided to filter the data to increase the likelihood of highlighting genuine exosomally enriched proteins (as detailed in "Experimental Procedures"). Corresponding gene name lists were generated from the remaining proteins for the SB17 + Nonidet P-40/DCO (supplemental Table S1) and SB17 conditions (supplemental Table S2).

FIG. 1. Preparation of highly pure Du145 exosomes for analysis via SOMAscan™. Du145 exosomal vesicles were separated on a continuous sucrose gradient, and the density of 15 collected fractions was determined. Nanoparticle tracking analysis was performed on each fraction, and the particle concentration was plotted against the fraction density. Bars represent mean ± S.D. of duplicate measurements (A). The size distribution of particles within each fraction is shown, and the density of each fraction is specified, revealing single-peak, monodisperse populations of small vesicles in fractions of classical exosomal density (between 1.1 and 1.2 g/ml) (B). A proportion of each fraction was coated onto latex microbeads, stained with antibodies as specified, and analyzed via flow cytometry. Bars represent median fluorescence values from 5000 events, and the positions of fractions 8-10 are annotated (C). This characterization aided in selecting relevant fractions (specifically, F8, F9, and F10) that were pooled for subsequent array analyses.

With respect to the SB17 + Nonidet P-40/DCO list (depicted in Fig. 2A), we found 57 proteins clearly elevated (>2-fold) in exosomes relative to cells; a selection of these are shown in Fig. 2B. These included proteins such as MFG-E8, which was ~180-fold enriched in exosomes. The integrin αVβ3 receptor was also highly enriched (~40-fold). The receptor MET, recently implicated in a metastasis priming function of melanoma-derived exosomes (36), exhibited less pronounced enrichment but nevertheless remained clearly elevated in exosomes (~4-fold) relative to cells. Other proteins not particularly noted for their association with exosomes were also highly enriched, including factors usually secreted such as stanniocalcin-1 and inter-α-trypsin inhibitor heavy chain family, member 4 (ITIH4). Some analytes exhibited comparable expression in exosomes and cells, including membrane proteins ALCAM (CD166) and amyloid precursor protein (Fig. 2C). A further 89 analytes were expressed in cells at levels >2-fold greater than the levels in exosomes, including thymidine kinase, peroxiredoxin-1, and the secreted glycoprotein galectin-8 (Fig. 2D).
Identification of those proteins that are simply elevated in exosomes relative to cells might be overly simplistic as a selection criterion for subsequent validation analysis. Plotting both fold-elevation and the mean RFU values (Fig. 2E) revealed several proteins such as NCAM-L1 (L1CAM) or LG3BP that may be well enriched in exosomes versus cells (~30-fold or ~8-fold, respectively) yet exhibit relatively low RFU values (<300 RFU). This scenario suggests that the protein abundance is low and possibly difficult to detect. In contrast, several proteins exhibited good enrichment together with high RFU values, and this might be a basis for prioritizing markers of interest. Well-known exosomal proteins, including MFG-E8, integrin αvβ3, DAF (CD55), β2M (a component of MHC Class I), and ICAM-1, fit this criterion well; multiplying fold enrichment with log2(RFU) provides a simple scoring system that highlights proteins both enriched and abundant in exosomes (Fig. 2F). Analytes exhibiting a high score thus may be good candidate proteins for the selective detection, or perhaps physical capture, of exosomes in other assay systems.
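The scoring system described above (fold enrichment multiplied by log2 of the RFU value) is simple enough to state directly in code. The sketch below is illustrative: the function names and the protein entries are hypothetical, and the default cutoffs (fold enrichment above 2 and RFU above 200) are taken from thresholds quoted elsewhere in the text rather than from a stated rule for this particular plot.

```python
import math


def enrichment_score(fold_enrichment, rfu):
    """Score = fold enrichment in exosomes x log2(mean RFU)."""
    return fold_enrichment * math.log2(rfu)


def rank_candidates(measurements, min_fold=2.0, min_rfu=200.0):
    """Rank analytes by score after applying fold and RFU cutoffs.

    `measurements` maps an analyte name to (fold_enrichment, mean_rfu).
    Returns (name, score) pairs, highest score first.
    """
    scored = [
        (name, enrichment_score(fold, rfu))
        for name, (fold, rfu) in measurements.items()
        if fold > min_fold and rfu > min_rfu
    ]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)


# Hypothetical analytes: (fold enrichment, mean RFU).
toy = {
    "PROT_A": (180.0, 4096.0),  # enriched and abundant
    "PROT_B": (30.0, 256.0),    # enriched, modest signal
    "PROT_C": (30.0, 128.0),    # enriched, but below the RFU cutoff
    "PROT_D": (1.5, 8192.0),    # abundant, but not enriched
}
ranking = rank_candidates(toy)
```

Because the RFU term enters as a logarithm, the score is dominated by fold enrichment, while the RFU cutoff removes enriched but weakly detected analytes of the kind discussed for L1CAM and LG3BP.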
With respect to the standard assay conditions (depicted in Fig. 3A), 33 analytes were found to be elevated >1.5-fold in exosomes relative to cells, although the magnitude of enrichment in this list was less marked, with G-CSF exhibiting the greatest difference (25-fold) (Figs. 3B and 3E). Other enriched proteins included angiogenesis-promoting factors such as angiogenin and VEGF-A, the inflammatory cytokine IL-8, and the migration-related proteins Rac1 and moesin. A further 24 analytes were expressed at similar levels in cells and exosomes, including cyclophilin A, cathepsin D, and proprotein convertase subtilisin/kexin type 9 (Fig. 3C), but the majority of analytes (574) were more strongly expressed in cells. These included HSP60, the immune regulatory molecule PD-L2 (CD273), the nuclear/cytoplasmic shuttling protein hnRNP A/B, and the anti-angiogenic collagen fragment endostatin (Fig. 3D). All those that were elevated >1.5-fold above cells and with an RFU of >200 are shown in Fig. 3E, and the same scoring system is plotted in Fig. 3F to aid in the selection of candidates of interest.
Possible Anomalous Identifications-Overall the data generated from the array seemed biologically plausible, as proteins that we would simply not expect to be present within the sample exhibited low or negligible RFU values. Given the nature of the purification process, we would not expect significant contamination of sucrose gradient fractions with FBS-derived material, but some reports have suggested that a degree of contamination with blood proteins is inevitable when using more complex ex vivo sources of exosomes isolated by such gradients (7). We manually explored the lists for examples of abundant serum/plasma proteins and found that RFU values for albumin, IgE, IgD, complement components (C5, C9, C3b), and coagulation factors (F5, F9, F10, F11) were low or negligible and well below our criterion of 200 RFU for accepting a positive identification. There were examples of some components with high RFU values, such as the deactivated form of complement C3 (C3a des-arginine anaphylatoxin (RFU 9576; SomaID SL003220)). C3 is often found in MS-based proteomics of exosomes, and low levels of mRNA for C3 are detectable in these cells (data not shown); this may therefore be a genuine identification (30). Nevertheless, given the high variation in replicates for C3a-des-arg and many others, they were excluded from our final list.
Protein-S is another highly abundant blood protein detected with high RFU values and good replicates that appeared as enriched in exosomes relative to cells. This protein was therefore allocated to the candidate list (Fig. 3F) and was certainly unexpected on initial examination as an exosomally expressed protein. Although protein-S is principally recognized as a modulator of coagulation, it also has less well-known functions in binding phosphatidylserine and aiding phagocytosis (37) in a manner similar to that of MFG-E8, which functions in aiding exosome uptake by dendritic cells (38). In addition, the expression of protein-S by prostate cancer cell lines has been documented (39), and we found mRNA for this in the Du145 cells (data not shown). Overall, the presence of FBS-derived material contributing to the array findings is therefore unlikely, and this finding suggests that the candidates identified by the scoring system (above) are certainly plausible as exosomally associated proteins.
Validation of SOMAmer®-identified Proteins-From the scoring system (presented in Figs. 2F and 3F), we compiled a list of candidate analytes and sought to determine whether these proteins could be detected in purified exosomes via other methods.
We first examined the capacity of SOMAmers® to bind to exosomes in a monoplex rather than multiplex fashion. This would serve to show that signals measured by SOMAscanTM are independent of any anomalous interactions between SOMAmers® arising from the mixture of 1129 SOMAmers® used in the assay. Sucrose cushion-purified exosomes were immobilized on microtiter ELISA plates, individual SOMAmers® containing a biotin tag were added, and following incubation and washes, bound SOMAmers® were detected using streptavidin-conjugated Europium by means of time-resolved fluorimetry (Fig. 4A). In this assay the signal for ROR1, Notch3, ADAM9, ITIH4, HAI-1, and others was above those of irrelevant control SOMAmers® (KDGL or Spuriomer). The signal strength, however, did not always fit with the array data; for example, we might have expected the signal for ADAM9 to be greater than that for ROR1 or HAI-1. Such monoplex assays using SOMAmers® in this configuration certainly require further optimization involving considerable investment and laboratory resources, but nevertheless they act to confirm the ability of these novel reagents to detect exosomally expressed proteins using more widely available laboratory tools.

[Figure legend, continued] The data were filtered to include only analytes reporting >1.5-fold elevation in exosomes and those with an RFU signal of >200 units, and these are shown (E), plotted according to fold enrichment (left axis) and RFU values (right axis). A simple multiplication of fold-increase × log2(RFU), used as a means of identifying proteins that may be both enriched and highly abundant in exosomes, is shown (F). Some candidate proteins were selected from this plot for subsequent validation analyses, and these are indicated with †.
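The candidate filtering and scoring rule (analytes elevated >1.5-fold in exosomes with an RFU signal >200, ranked by fold-increase × log2(RFU)) can be illustrated with a minimal Python sketch. The function name, the input layout, and the example values are hypothetical; the sketch also assumes that fold enrichment is the ratio of exosome RFU to cell RFU and that the RFU threshold applies to the exosome signal, neither of which is spelled out in the text.

```python
import math


def rank_candidates(analytes, min_fold=1.5, min_rfu=200.0):
    """Filter and rank analytes by fold-increase x log2(RFU).

    analytes: iterable of (name, exosome_rfu, cell_rfu) tuples.
    Assumption: fold = exosome_rfu / cell_rfu, and the RFU cut-off
    is applied to the exosome signal.
    Returns (name, fold, exosome_rfu, score) tuples, highest score first.
    """
    ranked = []
    for name, exosome_rfu, cell_rfu in analytes:
        if cell_rfu <= 0:
            continue  # fold change is undefined without a cell signal
        fold = exosome_rfu / cell_rfu
        if fold > min_fold and exosome_rfu > min_rfu:
            score = fold * math.log2(exosome_rfu)
            ranked.append((name, fold, exosome_rfu, score))
    return sorted(ranked, key=lambda row: row[3], reverse=True)


if __name__ == "__main__":
    # Hypothetical RFU values, for illustration only.
    example = [
        ("analyte_A", 1024.0, 256.0),   # enriched and abundant
        ("analyte_B", 150.0, 10.0),     # enriched but below the RFU cut-off
        ("analyte_C", 4096.0, 4096.0),  # abundant but not enriched
    ]
    for name, fold, rfu, score in rank_candidates(example):
        print(f"{name}: fold={fold:.1f}, RFU={rfu:.0f}, score={score:.1f}")
```

Multiplying by log2(RFU) rather than by raw RFU keeps a very abundant analyte from dominating the ranking on signal strength alone, so fold enrichment remains the main driver of the score.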
In a similarly configured assay, we stained for some array-identified proteins using antibodies instead of SOMAmers®. The analytes CD36, ADAM9, Notch3, and tissue factor were readily detected, and a weak yet positive signal was seen for RAC1 and glypican-3 that remained well above staining using IgG-control antibodies (Fig. 4B). To confirm that some of these proteins float at typical exosomal densities, we fractionated exosomes by ultracentrifugation on a continuous gradient and analyzed fractions using the microplate approach as described above. This revealed data similar to those presented in Fig. 1C, with peak expression of Notch-3, RAC1, uPA, and VEGF apparent within the classical density range of 1.1 to 1.2 g/ml (Fig. 4C). A commercial sandwich ELISA for VEGF-A was also used and measured 3 pg of VEGF-A per 1 µg of Du145 exosomes (not shown).
Given that these proteins were selected for their apparent enrichment in exosomes relative to cells, we investigated whether this was true by subjecting exosomes and cell lysates, corrected for total protein, to Western blotting. The multivesicular endosomal protein TSG101 was used as a positive control because it is clearly enriched in exosomes and is a recognized marker for the compartments giving rise to exosomes. The opposite pattern is seen when staining for the endoplasmic reticulum marker calnexin, which is much more abundant in cells, and thus this was used as a further control. In parallel, RAC1, which was enriched ~3-fold according to the array, was shown to be elevated in exosomes by Western blotting. Similarly, ADAM9, tissue factor, and DAF, with 6-, 9-, and 21-fold enrichment, respectively, were clearly preferentially expressed by exosomes. Tissue factor and DAF are incidentally known as exosomally associated proteins and serve as additional evidence here of a repertoire of proteins consistent with what is already known about exosomes (40,41). The array identified enrichment for L1CAM (NCAML1) and NOTCH3 by 34- and 126-fold, respectively. Although these were readily detected in exosomes, they appeared undetectable in cells via Western blotting even at 50 µg per lane (Fig. 4D), which was in agreement with the high level of enrichment identified by the array. Although floatation on sucrose is a valid means of discriminating vesicles from proteins that may co-isolate during the high-speed ultracentrifugation of vesicles, it remains theoretically possible that the material identified might not genuinely be part of the vesicle structure. To address this, we captured exosomes on plates using anti-CD9 antibodies and detected positive signals for some of the identified proteins (ADAM9, Notch3, and RAC1) (Fig. 4E). This shows co-localization of CD9 with these proteins and, together with floatation properties, points to their presence in exosomal vesicles.
In conclusion, many of the candidates of interest identified by this vast multiplex array technology have been confirmed to be present on exosomes via other methods, and several of these were confirmed in this study to be concentrated in vesicles in comparison with the parent cell. The platform therefore provides protein identifications that can be verified through more traditional approaches.
Comparison with Vesiclepedia Entries-In order to determine how well the SOMAscanTM data fit with previously performed proteomics analyses of exosomes, we queried the Vesiclepedia database, which curates MS proteomics and other types of analyses of vesicles including exosomes (30). We searched with the terms "human," "exosome," and "prostate" and found 11 database entries for proteins, generating a total of 532 unique entries. These identifications were compared with the SOMAscanTM identifications with an RFU of >200. The comparison with Vesiclepedia is summarized in supplemental Fig. S3 and shows that of the 532 proteins present in the Vesiclepedia-derived dataset, 91 were identified by SOMAscanTM. Therefore, 19% of the proteins positively identified by SOMAscanTM are confirmed by previous MS studies (supplemental Fig. S3B). Of the entire array coverage, 26 proteins (2.6% of the identifications) have been found in previous studies but fell below the threshold for consideration as positive identifications (supplemental Fig. S3C). Importantly, however, 392 protein identifications were unique to the SOMAscanTM discovery platform (supplemental Fig. S3D) and represent proteins that are not found in Vesiclepedia curated studies. A similar analysis was performed following the removal of ambiguous identifications arising from protein to gene name conversion (as listed in supplemental Table S3), and this led to slight amendment of the above figures to 363 unique SOMAmer-based identifications with 87 also present in the Vesiclepedia dataset (supplemental Fig. S4).
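The overlap figures quoted here follow from simple set arithmetic: 91 shared identifications plus 392 array-only identifications give 483 positive identifications, of which 91 (about 19%) are also present in the Vesiclepedia-derived set. A minimal Python sketch of such a comparison is given below. The function name and the placeholder identifiers are hypothetical; only the counts (91, 392, 532) are taken from the text, and real use would substitute the actual gene-symbol lists.

```python
def overlap_summary(array_hits, database_entries):
    """Compare array identifications with a curated database list.

    Both arguments are iterables of identifiers (e.g. gene symbols).
    Returns counts of shared and exclusive identifiers and the
    percentage of array identifications found in the database.
    """
    array_hits = set(array_hits)
    database_entries = set(database_entries)
    shared = array_hits & database_entries
    return {
        "shared": len(shared),
        "array_only": len(array_hits - database_entries),
        "database_only": len(database_entries - array_hits),
        "percent_confirmed": 100.0 * len(shared) / len(array_hits),
    }


if __name__ == "__main__":
    # Placeholder identifiers sized to match the counts in the text.
    shared = {f"shared_{i}" for i in range(91)}
    array_only = {f"array_{i}" for i in range(392)}
    database_only = {f"db_{i}" for i in range(532 - 91)}
    summary = overlap_summary(shared | array_only, shared | database_only)
    print(summary)
```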
FIG. 4. Confirmation of the expression of selected SOMAscanTM-identified proteins. Du145 exosomes purified via the sucrose cushion method were immobilized at specified doses on ELISA microplates and probed using individual SOMAmers® (A) or an indirect staining method with antibodies (B). Signals were detected using Europium-streptavidin and time-resolved fluorimetry (TRF) in each case (bars represent mean ± S.E. of duplicate measurements). SOMAmers® (KDGL or Spuriomer) act as irrelevant controls for nonspecific binding (A). To confirm that the identified proteins float at a classical exosomal density, a continuous sucrose gradient fractionation was performed, and the density of collected fractions was determined prior to immobilization on microplates as described above. Proteins were detected with antibodies using the same indirect staining method as described above. The fractions with densities between 1.1 and 1.2 g/ml are annotated (C). Whole cell lysates and exosomes normalized for protein were subjected to SDS-PAGE and Western blotting and probed with antibodies as indicated. This revealed relative exosomal enrichment for all candidates, whereas calnexin exhibited the reverse pattern (D). Sucrose cushion-purified exosomes were labeled in solution with primary antibody (as specified) and secondary biotinylated antibody before being immobilized on microplates precoated with anti-CD9 or isotype control antibody. After washing, signals were detected using Europium-streptavidin and TRF (bars represent mean ± S.E. of quadruplicate measurements) (E).

FIG. 5. Biological themes related to exosome-enriched proteins. Network diagrams showing the relationship of biological themes that were significantly enriched in the list of 50 genes with the greatest fold change in exosomes relative to cells for the SB17 + Nonidet P-40/DCO conditions (A) and the SB17 conditions (B). Edge thickness indicates the number of genes shared between terms. Key: GO, Gene Ontology; IPR, InterPro; SM, SMART domains; the remaining terms are UniProt derived. The DAVID terms used to derive these diagrams can be found in supplemental Table S4.

Bioinformatics Analysis of Array Data-In order to examine the biological information provided by the array, we used the DAVID bioinformatics tool to explore biological themes related to array identifications that were elevated in exosomes relative to cells. The networks generated highlight several terms we would expect from our current understanding of exosomes. These are dominated by multiple terms related to membrane-associated proteins, including "intrinsic to plasma membrane," "transmembrane," "GPI-anchor," and "disulfide bond" (Fig. 5A), which is a particular trait of exosomes (42). There are also terms related to the extracellular environment, including "secreted" and "extracellular region," and together with "vesicle lumen" and "cytoplasmic membrane-bounded vesicle lumen" (Fig. 5B) these are consistent with a secreted, membrane-bound vesicle carrying membrane proteins. There have been extensive studies of endocytosis of the EGF-receptor and its subsequent intracellular processing and degradation, and EGF-receptor has been used as a means of tracking exosome biogenesis (43) and the physiological dissemination of exosomes in biofluids (44). Thus terms related to the EGF axis (Fig. 5A), especially for cancerous epithelial exosomes, are not unexpected. Terms such as "platelet alpha granule" (Fig. 5B) might initially be surprising, as the sample analyzed was not platelet-derived. However, platelet α granules have a structure similar to that of multivesicular endosomes and demonstrate intraluminal nanovesicular structures that have been termed exosomes (45). Terms related to complement and coagulation are also increasingly recognized as features of extracellular vesicles and are again not entirely unexpected from the analysis (46,47). However, a cluster of gene ontology terms suggesting protease inhibitor activity (Fig. 5B) is not to our knowledge an aspect previously reported for exosomes and might represent a novel aspect of the data arising from the SOMAscanTM method. This appears to contradict a recent study suggesting a proteolysis-promoting function for exosomes secreted by some cancer cells, at least with respect to targeting matrix (48). The control of proteolysis, therefore, might be an additional feature of cancer exosomes to consider with respect to future biomarker and functional studies. Of note, although proteins relating to cell death and cellular compartments including the nucleus, mitochondria, and others are well covered by the array, these do not feature in these analyses, emphasizing that the sample analyzed was pure and devoid of contaminating cell-derived material.
In summary, this analysis shows that although the SOMAscanTM assay remains a closed platform, it provides a breadth of biologically relevant information that agrees well with MS-based analyses of exosomes and with our current understanding of such vesicles.

DISCUSSION

We present a proteomics analysis of exosomal vesicles conducted using the novel SOMAmer®-based proteomics assay SOMAscanTM. We have identified over 300 proteins that, according to the Vesiclepedia database (30), have not previously been associated with exosomes of prostate origin. Moreover, we have looked at the relative levels of the proteins in exosomes and in the parent cell and have highlighted several novel proteins with clear enrichment in the vesicles. This has been confirmed via more traditional approaches for several identifications. This technological approach, which has not previously been used for exosomes, confers many advantages over traditional proteomics tools. In particular it addresses the important question of relative protein abundance across complex samples. The platform provides future options for high-throughput analysis of exosomes isolated from clinical specimens that surpass the current state of the art in mass spectrometry.
Dealing with membranous samples is notoriously difficult because of problems regarding the insolubility of highly hydrophobic transmembrane proteins. Sample preparation/solubilization for exosomes therefore needs particular attention in this respect, and robust protocols including the use of denaturing agents followed by a solvent-precipitation clean-up step add to the cumbersome nature of these workflows (29). In contrast, the SOMAmers® utilized in the SOMAscanTM assay have been through an involved and robust selection process (20) and were chosen for their ability to bind native tertiary protein structures (20,21). They bind selectively to their targets in the context of highly complex protein mixtures in solution (such as serum) (19). This might not necessarily be compatible with identifying proteins embedded within a membrane or encapsulated within a vesicle lumen, and some steps might be required in order to allow SOMAmers® to access such proteins. Several detergents included in the standard buffer system demonstrated a negative effect on SOMAmer® binding function, with a reduction in signal when we might have expected improved exosome solubilization and hence elevated signals. We could not find a single sample solubilization method that would satisfy both protein liberation and SOMAmer® function for all of the SOMAmers® in the array. However, future approaches (perhaps screening other detergents/combinations, using physical methods such as sonication, or adding detergent-removal steps) might provide a means of exosome analysis without the need to run specimens under two conditions in parallel as we have done here.
With respect to the chosen sample treatment approach, intraluminal or transmembrane proteins such as Delta-like 4, IGF-II receptor, and cytosolic tyrosine kinase-TYK2 gave elevated signals in the presence of added detergent. However, the partitioning between expected surface accessible versus intraluminal constituents was certainly not absolute, as some proteins that one might envisage to be encapsulated, including ubiquitin and lactate dehydrogenase, appeared with higher signals under the standard assay conditions. The presence of the detergent Tween-20 in the standard buffer used for the SOMAscanTM assay is likely to contribute to the penetration of the exosome membrane to some extent, and this might partly explain why such luminal constituents were identified in the standard SB17 conditions. Therefore it is not possible with the presented data to assign a precise vesicular location to the identified proteins, but this is an aspect that can be addressed by other methods such as flow cytometry or microplate-based assays.
Among the proteins identified in exosomes were cyclophilin-A, lactate dehydrogenase, and GAPDH. These were reported with very high RFU values, suggesting they are relatively abundant constituents of the vesicles. Such "housekeeping" proteins are likely to be coincidentally included during exosome biogenesis by virtue of their very high cellular abundance. There are, however, many examples of proteins detected in cells but reported as negative in the exosome specimens. Examples include apoptosis-related proteins (apoptosis regulator Bcl-2, apoptosis regulator Bcl-X, mitogen-activated protein kinase 8) and nuclear/mitochondrial proteins (mediator complex subunit 1, peptidylprolyl isomerase E, TATA-box-binding protein, Pescadillo, 3-hydroxyacyl-CoA dehydrogenase type-2, DnaJ homolog subfamily C member 19, p21-activated kinase 7). This therefore points to a specimen that does not represent these cellular compartments well and is unlikely to be related to apoptotic debris. Similarly, several examples of proteins present in exosomes are clearly proportionally less abundant in the parent cell; this might indicate an important function for such proteins in the context of their vesicular association. Examples include Hsp70, MFG-E8, and DAF, which are known to function in exosome-mediated immune modulation (13,38,40,49,50). This phenomenon of enrichment of certain proteins during exosome biogenesis was described originally by Johnstone et al. for transferrin receptor (51,52) and is now a well-recognized feature of exosomes, but quantifying the degree of enrichment has historically been a challenge in the field. Traditionally, comparison of exosomes with their parent cells has largely relied on Western blotting to demonstrate enriched proteins, but performing such analyses for hundreds of candidates is of course impractical. 
The present study highlights and quantifies several striking examples of enrichment, exemplified by Notch3 (126-fold), MFG-E8 (188-fold), and ITIH4, which exhibited the greatest exosomal enrichment (365-fold).
The mechanisms for concentrating such components so efficiently into exosomes remain poorly understood. The endosomal sorting complex required for transport (ESCRT) machinery has long been implicated in targeting ubiquitinated proteins into exosomes (53). Although ubiquitin is certainly an abundant component of exosomes, not all exosomal proteins are subject to ubiquitination, and alternative ESCRT-independent mechanisms of multivesicular body biogenesis have been described (43). Of note, SOMAscanTM has identified known and many novel exosome-associated proteins that should predominantly pass through the Golgi for secretion via classical routes, including G-CSF, VEGF, IL-8, TGFβ1, BMP-14, and a host of others. It is currently unclear whether these and other classically secreted components can also undergo unconventional exocytic transport via exosomes (54) or whether such components associate with exosomes while present in the extracellular space. We formally demonstrated that soluble TGFβ1 does not co-isolate at the same density as exosomes separated on sucrose gradients (28); thus we doubt that such soluble proteins are present due to accidental co-isolation. Nevertheless, this evolving area raises the intriguing prospect of exosomes acting as a physiological mechanism for delivering in concert a complex repertoire of growth factors/cytokines to recipient cells and directing unique cellular responses perhaps not possible with soluble factors (28).
Although the protein coverage of the SOMAscanTM assay remains superior to that of antibody-based array platforms and is in fact expanding to include around 200 to 500 new analytes per year, it nevertheless gives us a rather narrow window into the vast human proteome. In contrast, MS platforms have an advantage here, as they are open to potentially identifying any protein, albeit with reasonably high expression levels, and highlighting subtle post-translational modifications. The SOMAscanTM technology therefore introduces an array-dependent bias into the resulting data. Nevertheless, probing Gene Ontology, focusing on proteins elevated in exosomes relative to cells, raises a host of biological themes consistent with our current knowledge of exosome vesicles, including terms related to membrane proteins or to the extracellular environment, and lacking terms related to cell death and compartments such as the nucleus. Although the specific identifications may differ, the overall biological information generated by this closed array platform compares well with information generated by an open methodology such as MS.
Demonstration here of the utility of the SOMAscanTM assay with exosomes opens the door for the discovery of new protein markers for prostatic cancer and any other disease. However, some challenges remain to be addressed, as proteomics approaches do require exquisite purification of exosomes from complex biofluids, which is difficult to achieve (7). Characterizing abundant surface-available exosomal proteins such as Notch-3 and ADAM9, for example, might aid in the development of cancer-selective affinity-based isolation strategies to address this. Interest in the utility of exosomes as a source of disease biomarkers continues to grow apace, but there are few well-powered studies in any disease setting available, as these are difficult to achieve using conventional MS-based approaches. The SOMAscanTM platform is a tool with enormous potential in this regard, as it is sensitive, semiquantitative, and suitable for large sample sets. Overall it is an excellent and valuable addition to the repertoire of proteomics platforms currently available.