Differential ultracentrifugation enables deep plasma proteomics through enrichment of extracellular vesicles

Human plasma is a rich source of biomedical information and biomarkers. However, the enormous dynamic range of plasma proteins limits its accessibility to mass spectrometric (MS) analysis. Here, we show that enrichment of extracellular vesicles (EVs) by ultracentrifugation increases plasma proteome depth by an order of magnitude. With this approach, more than two thousand proteins are routinely and reproducibly quantified by label-free quantification and data independent acquisition (DIA) in single-shot liquid chromatography tandem mass spectrometry runs of less than one hour. We present an optimized plasma proteomics workflow that enables high-throughput analysis with very short chromatographic gradients, allowing a hundred samples to be analyzed per day with deep proteome coverage, especially when including a study-specific spectral library generated by repeated injection and gas-phase fractionation of pooled samples. Finally, we test the workflow on clinical biobank samples from malignant melanoma patients receiving immunotherapy to demonstrate the improved proteome coverage, supporting the potential for future biomarker discovery.

acquisition using the mass spectrometer, precursor peptides are identified by database searching followed by assembly into the shortest list of proteins that can account for the identified peptides. In parallel, the intensities of the quantified peptides are used to calculate intensities of the measured proteins, which correspond to protein abundances in the sample [4].
The unbiased approach of MS-based analysis has a great advantage over traditional immune-based assays since it in principle allows identification and quantification of all proteins present in a system, also known as the proteome. However, the hypothesis-free MS approach also requires laborious analytical workflows and interpretation of the resulting comprehensive data output to potentially discover new or unexpected biomarkers [5].
The main challenge of proteome analysis of plasma is the extremely high dynamic range of plasma proteins. Proteins of high abundance such as albumin, immunoglobulins, and coagulation factors comprise more than 75% of the total protein content in plasma, with concentrations in the mg/ml range. Conversely, very low abundant signaling proteins such as endocrine hormones or cytokines, which are usually the molecules of interest, are often only present in the pg/ml range. Consequently, individual protein abundances span up to ten orders of magnitude, making the lower abundant proteins in raw plasma undetectable even to state-of-the-art mass spectrometers [5,6].
Different strategies have been investigated to address this issue, through either selective depletion or enrichment techniques prior to MS analysis. Columns that selectively deplete the most abundant plasma proteins have been developed and can more than double the number of identified proteins, but besides the specificity issues of protein removal kits, depletion also removes a large group of proteins bound to albumin [7]. More recent approaches involve competitive binding of low-abundant proteins to nanoparticles with different surface chemistries [8]. With these approaches, more than 4000 proteins can be quantified in human plasma, but the required workflows are often strenuous and expensive [9].
Besides a multitude of soluble proteins, plasma is also an abundant source of extracellular vesicles (EVs) including apoptotic bodies (>1000 nm), microvesicles (MVs; 100-1000 nm), and exosomes (40-100 nm) [10]. Approximately 25%-50% of circulating EVs are platelets or platelet-derived with important functions for blood coagulation, but for the majority of EVs, the physiological function and importance are still not clear [11]. EVs have been linked to different biological signaling pathways and correlated with the severity of different diseases including diabetes and cancer, where they may be important for metastasis by supporting pre-metastatic niche formation [12,13]. EVs also play important roles in signaling and regulation of the immune system and have been linked to autoimmunity [14,15].
The growing interest in plasma EVs, particularly exosomes, has led to the development of enrichment techniques enabling subsequent MS-based proteomic analysis [16,17]. Exosome isolation through ultracentrifugation is time consuming and requires specialized centrifuge equipment that can reach >100,000 g, making it inaccessible to many researchers and difficult to apply to large patient cohorts. As an alternative, larger EVs can be isolated at much lower centrifugation speeds without the need for expensive high-speed ultracentrifuges.
In this study, we introduce a fast and scalable sample preparation workflow to analyze the plasma proteome through enrichment of EVs >100 nm by ultracentrifugation and data independent acquisition (DIA) based LC-MS/MS analysis using short single-shot LC gradients.

Significance Statement
Increasing the throughput and sensitivity of blood plasma proteomics is highly desirable for the unbiased discovery of protein biomarkers using mass spectrometry (MS). The study explores a fractionating approach using ultracentrifugation of plasma to selectively enrich extracellular vesicles.
We show that the approach significantly increases the number of proteins that can be quantified in plasma. The simple workflow enables analysis of undepleted plasma, preserving sample integrity, and is highly reproducible. Through optimizations, we are able to quantify up to 4000 proteins with a liquid chromatography (LC) gradient of less than one hour and demonstrate that deep plasma proteomes can be achieved with even shorter gradients, making it a scalable approach. Using patient samples, we demonstrate that EV enrichment from plasma may be used as a liquid biopsy to monitor protein regulation in response to cancer immune therapy. EVs are secreted by virtually all cell types and have many physiological functions, especially within immune signaling, but are overall poorly understood. In summary, the current work demonstrates a straightforward workflow that enables a whole new dimension in MS-based biomarker discovery in plasma and facilitates much needed research in EV biology.

EV enrichment by ultracentrifugation
For isolation of the EV-enriched plasma fraction, 250, 500, or 1000 µl platelet-poor plasma (PPP) was supplemented with phosphate-buffered saline (PBS, cat #20012-019, Gibco) to an equal volume of 1000 µl. The samples were then centrifuged at 20 000 g at 4 °C for 30 min, followed by removal of the supernatant, leaving approximately 20 µl together with the EV pellet. PBS was added to 1000 µl, followed by a second centrifugation at 20 000 g for 30 min [18]. The pellet and a small volume (∼20 µl) of remaining supernatant, hereafter termed the 20k EV fraction, were then subjected to tryptic protein digestion.
Patient samples were frozen as platelet-rich plasma (PRP) in 500 µl aliquots and after thawing, the samples were centrifuged at 3000 g for 3 min to remove remaining platelets prior to EV enrichment as described above.
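As a small worked aid for the centrifugation steps above, the following sketch computes the PBS top-up volume for each starting volume and converts the 20 000 g setting into a rotor speed with the standard relation RCF = 1.118 × 10⁻⁵ × r × rpm² (r in cm). The rotor radius of 8 cm and all function names are illustrative assumptions and are not taken from the protocol.

```python
import math

# Standard relation between relative centrifugal force (x g),
# rotor radius (cm) and rotor speed (rpm):
#   RCF = 1.118e-5 * radius_cm * rpm**2
RCF_CONSTANT = 1.118e-5


def rpm_for_rcf(rcf_g: float, radius_cm: float) -> float:
    """Rotor speed (rpm) needed to reach a given RCF at a given radius."""
    return math.sqrt(rcf_g / (RCF_CONSTANT * radius_cm))


def rcf_for_rpm(rpm: float, radius_cm: float) -> float:
    """RCF (x g) produced at a given rotor speed and radius."""
    return RCF_CONSTANT * radius_cm * rpm ** 2


def pbs_top_up_ul(plasma_ul: float, final_ul: float = 1000.0) -> float:
    """Volume of PBS (ul) needed to bring a plasma aliquot to the final volume."""
    if plasma_ul > final_ul:
        raise ValueError("plasma volume exceeds final volume")
    return final_ul - plasma_ul


if __name__ == "__main__":
    ASSUMED_RADIUS_CM = 8.0  # hypothetical rotor radius, not from the protocol
    for start_ul in (250, 500, 1000):
        print(f"{start_ul} ul plasma + {pbs_top_up_ul(start_ul):.0f} ul PBS")
    rpm = rpm_for_rcf(20_000, ASSUMED_RADIUS_CM)
    print(f"20 000 g at r = {ASSUMED_RADIUS_CM} cm is about {rpm:.0f} rpm")
```

The required speed depends on the actual rotor, so the radius should be replaced with the value from the rotor's documentation before use.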

Protein digestion
The volume of all samples was equalized with Milli-Q water prior to digestion. For digestion of unfractionated plasma, 5 µl PPP was used. To evaluate the optimal digestion method, the samples were

LC-MS/MS analysis
The resulting peptide concentrations were approximated with A280 measurements using NanoDrop (ThermoFisher Scientific) and the sam-
In addition, a project-specific spectral library was generated by injecting the same sample multiple times and applying gas-phase fractionation in combination with DIA acquisition in a series of staggered narrow windows to fractionate precursors in a near DDA-like mode [19].
Exposed films were developed in a Kodak Medical X-ray processor (Carestream Health) and scanned using a CanoScan 8800 scanner (Canon).

Data analysis
The collected RAW-files containing the MS data were analyzed using
The area under the curve for each targeted ion was used for label-free quantification (LFQ), and data were normalized with the Spectronaut Cross-Run normalization feature. The patient sample data were normalized with the "normalizeQuantiles" function from the Bioconductor R package LIMMA [20]. The medians of technical replicates were further normalized by row-wise median subtraction for each patient individually (±immunotherapy) prior to differential regulation testing. Volcano plots showing differential protein abundance upon immunotherapy in patient samples were generated by plotting the -log10-transformed, FDR-adjusted p values (FDR < 0.05) derived from a two-sided t-test against the log2-transformed fold changes, as indicated.
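To illustrate how volcano plot coordinates can be derived from per-protein test results, the sketch below adjusts p values and pairs them with log2 fold changes. The text does not name the FDR procedure, so the Benjamini-Hochberg method used here is an assumption, as are the function names; p values are taken as given inputs rather than computed from a t-test.

```python
import math


def bh_adjust(p_values):
    """Benjamini-Hochberg adjusted p values, returned in the input order."""
    m = len(p_values)
    order = sorted(range(m), key=lambda i: p_values[i])
    adjusted = [0.0] * m
    running_min = 1.0
    # Walk from the largest p value down so adjusted values stay monotone.
    for rank in range(m, 0, -1):
        idx = order[rank - 1]
        running_min = min(running_min, p_values[idx] * m / rank)
        adjusted[idx] = running_min
    return adjusted


def volcano_points(log2_fold_changes, p_values, alpha=0.05):
    """Return (log2 fold change, -log10 adjusted p, significant) per protein."""
    adjusted = bh_adjust(p_values)
    points = []
    for fold_change, q in zip(log2_fold_changes, adjusted):
        # Guard against log10(0) for extremely small adjusted p values.
        points.append((fold_change, -math.log10(max(q, 1e-300)), q < alpha))
    return points


if __name__ == "__main__":
    # Hypothetical values for four proteins, for illustration only.
    for point in volcano_points([1.0, -0.5, 2.0, 0.1], [0.01, 0.04, 0.03, 0.5]):
        print(point)
```

Each returned tuple gives the x coordinate, the y coordinate, and whether the protein passes the chosen FDR threshold.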
Spectral libraries used for Spectronaut searches were generated using DIANN v 1.8, searching against the same two FASTA files as mentioned above with DIANN's default settings, except for ticking "Generate spectral library," enabling "Ox(M)," and setting Precursor charges to "2-5," Neural network classifier to "Double pass mode," and Library generation to "Smart profiling" [21].
Proteomaps were generated for visual comparison of the plasma proteome with and without fractionation. Proteomaps show a visual representation of all quantified proteins by polygons proportional to their abundance and with a color code and grouping according to functional relationship [22]. The maps (Voronoi-plots) were generated by www.proteomaps.net using the untransformed protein intensities determined by Spectronaut and the associated Uniprot protein identifiers. Venn diagrams were made using gene names.

RESULTS
To overcome the dynamic range of protein abundances in plasma, we fractionated plasma samples by ultracentrifugation to enrich for EVs after prior removal of platelets by centrifugation at 3000 g. The resulting supernatant representing PPP was subjected to ultracentrifugation at 20,000 g for 30 min twice to pellet the EVs (20k EVs) (Figure 1A) [18].
To find the best compromise between scalability and protein coverage, we optimized our workflow for high-throughput analysis of plasma proteomes by testing different sample preparation methods and LC-MS/MS acquisition strategies, including analysis of both unfractionated and EV-enriched plasma (Figure 1B). The initial analysis was performed using standardized LC-MS/MS parameters with a 21 min gradient and directDIA, analyzing samples prepared with the PAC digestion method [24].
We proceeded to optimize several aspects of the workflow cov-

Increased proteome depth with plasma fractionation
The processing of plasma into a 20k EV fraction led to consistent identifications of approximately 2200 protein groups across the entire workflow with three technical replicates performed from each of three biological replicates (Figure 2A; Supporting Information Table S1). The LC-MS/MS analyses of these samples were performed using 21 min gradients on an Evosep One HPLC system connected online to an Orbitrap Exploris 480 mass spectrometer [25]. The mass spectrometer was operated in DIA mode, and all raw MS files were processed with the directDIA workflow in Spectronaut without using a spectral library.
The number of proteins identified in the EV fractions was almost eightfold higher than the approximately 250 protein groups identified using the same analysis strategy for the unfractionated plasma samples. We were also able to increase the number of protein IDs in the unfractionated plasma sample to 360 protein groups simply by using the 20k EV datafiles as a library extension in Spectronaut.
We assessed the reproducibility of the quantification between technical LC-MS/MS injections and complete workflow replicates, and found that our workflow was highly reproducible within both the unfractionated plasma and the EV-enriched fraction, with Pearson correlation coefficients in the range 0.97-0.99 for both technical and workflow replicates (Figure 2B). The coefficient of variation (CV) was low overall but higher in the EV fraction, with median CVs of 12%-16% intra-assay and 14% inter-assay, compared with 7%-8% intra-assay and 5% inter-assay in the unfractionated plasma (Figure 2C).
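The two reproducibility metrics above can be sketched as follows. This is a minimal illustration with hypothetical intensities; whether the original analysis computed CVs on raw intensities and correlations on log-transformed values is not stated here, so both choices are assumptions, as are the function names.

```python
import math
import statistics


def cv_percent(values):
    """Coefficient of variation (sample SD / mean) in percent."""
    return 100.0 * statistics.stdev(values) / statistics.mean(values)


def median_cv(intensity_table):
    """Median CV across proteins; each row holds one protein's replicate intensities."""
    return statistics.median([cv_percent(row) for row in intensity_table])


def pearson_r(x, y):
    """Pearson correlation coefficient between two equally long sequences."""
    mean_x, mean_y = statistics.mean(x), statistics.mean(y)
    sxy = sum((a - mean_x) * (b - mean_y) for a, b in zip(x, y))
    sxx = sum((a - mean_x) ** 2 for a in x)
    syy = sum((b - mean_y) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)


if __name__ == "__main__":
    # Hypothetical LFQ intensities: three proteins, three technical replicates.
    table = [[100, 110, 90], [200, 200, 200], [50, 60, 40]]
    print(f"median intra-assay CV: {median_cv(table):.1f}%")
    # Correlation between two replicates on log2-transformed intensities.
    rep1 = [math.log2(row[0]) for row in table]
    rep2 = [math.log2(row[1]) for row in table]
    print(f"Pearson r: {pearson_r(rep1, rep2):.3f}")
```

Intra-assay CVs would be computed across technical replicates within one workflow replicate, and inter-assay CVs across the workflow replicates.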

Comparison of the 20k EV fraction proteome to unfractionated plasma
To compare the quantitative and qualitative differences between the fractionated and unfractionated plasma proteome, we visualized their proteome compositions using Voronoi diagrams (Figure 3A).
Figure 2 (caption fragment): (A) ... with plasma fractionation allowed identification and quantification of six to eight times more proteins than from unfractionated plasma, with almost all protein groups identified across three independent WF replicates. The median of three separate workflows with three technical replicates is shown on top of each bar. The co-analysis in Spectronaut combining DIA data from unfractionated plasma with the 20k EV fraction allowed us to identify 50% more proteins from the same samples. (B) A correlation analysis between technical and workflow replicates showed that the quantification was highly reproducible, with Pearson correlation coefficients of 0.97-0.99. (C) Both intra-assay CVs between three technical replicates within three separate WF replicates and the inter-analytical CV across three separate workflows were low. CV, coefficient of variation; DIA, data independent acquisition; EV, extracellular vesicle; WF, workflow
To further evaluate the selective enrichment of certain proteins in the EV fraction, we compared the protein LFQ intensity of 239 protein groups that were identified in both 20k EV and unfractionated plasma (Figure 3D). We found that 23 proteins were enriched more than 10-fold in relative abundance in the 20k EV fraction compared to unfractionated plasma and 11 proteins were highly enriched (>100-fold) (Supporting Information Table S2). The enriched proteins included intracellular proteins such as talin-1, actins, cytoskeletal keratins, and many proteins generally related to exosomes. Nine of the 11 highly enriched proteins were also found in the Exocarta Top100 Protein list of exosomal proteins.
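The enrichment comparison can be illustrated with a short sketch that ranks shared proteins by the ratio of their relative abundance in the two fractions. How relative abundance was defined in the original analysis is not specified here; scaling each sample to a total of 1 is an assumption, and the protein intensities and function names below are hypothetical.

```python
def relative_abundance(intensities):
    """Scale a {protein: LFQ intensity} mapping so that the values sum to 1."""
    total = sum(intensities.values())
    return {protein: value / total for protein, value in intensities.items()}


def enrichment_tiers(ev, plasma, enriched=10.0, highly_enriched=100.0):
    """Classify proteins found in both samples by EV/plasma relative abundance."""
    ev_rel = relative_abundance(ev)
    plasma_rel = relative_abundance(plasma)
    tiers = {"highly_enriched": [], "enriched": [], "other": []}
    # Only proteins quantified (non-zero) in both samples are compared.
    for protein in sorted(ev.keys() & plasma.keys()):
        fold = ev_rel[protein] / plasma_rel[protein]
        if fold > highly_enriched:
            tiers["highly_enriched"].append(protein)
        elif fold > enriched:
            tiers["enriched"].append(protein)
        else:
            tiers["other"].append(protein)
    return tiers


if __name__ == "__main__":
    # Hypothetical intensities, for illustration only.
    ev_fraction = {"TLN1": 500.0, "ACTB": 300.0, "ALB": 200.0}
    unfractionated = {"TLN1": 1.0, "ACTB": 9.0, "ALB": 990.0}
    print(enrichment_tiers(ev_fraction, unfractionated))
```

The 10-fold and 100-fold thresholds mirror the cut-offs quoted in the text; the tiers here are mutually exclusive, which is a design choice of this sketch.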
Lipoprotein particles are a potential source of contamination to the 20k EV fraction, and to quantify their relative contribution to the 20k EV fraction, we compared the intensities of 29 lipoproteins shared between the 20k EVs and unfractionated plasma. We found a strong linear correlation (R² = 0.92), indicating that the 20k EV fraction was not enriched for lipoprotein particles (Supporting Information Figure S1).
To benchmark the quantitative accuracy of our plasma proteomics workflow, we compared the relative quantification of proteins in our sample with reference plasma concentrations measured by immunoassays, as this is considered the gold standard for quantification of proteins in plasma. When comparing to a list of 365 reference protein concentrations measured with immunoassays, we found an overlap with 87 and 113 proteins in our dataset for unfractionated plasma and 20k EVs, respectively [28]. Although these values are prone to high interindividual heterogeneity, the correlation between the immunoassay-based plasma concentration and the LFQ intensity was high. We found Pearson correlation coefficients of 0.99 in the unfractionated plasma and 0.94 in the 20k EV fraction, supporting high relative accuracy of the LC-MS/MS quantification also in fractionated samples (Figure 3E,F).
To further characterize the 20k EV fraction, nanoparticle tracking analysis (NTA) was applied to analyze EV size distribution profiles in the different plasma fractions. This demonstrated an enrichment of larger EVs in the 20k EV fraction, with a median size of 155 nm compared to 98 nm in unfractionated plasma (Figure 3G). The size and presence of large EVs were also confirmed by analyzing the 20k EV fraction by cryo-transmission electron microscopy (Figure 3H). Finally, western blotting was applied as an orthogonal technique to MS to validate the presence and enrichment of exosomal marker proteins (CD9, Flotillin-1, CD54) and intracellular proteins (ACTB) in the 20k EV fraction (Figure 3I and Supporting Information Figure S2).

Optimization of fractionation and proteomics workflows
The necessary starting volume of plasma samples is important to consider as it will correlate with the yield of 20k EVs. However, requiring large plasma volumes could limit the use of patient biobank material. To find the optimal starting volume of plasma required for fractionation, we evaluated our protocol using three different starting volumes.
We found that our protocol performed equally well with 500 µl plasma compared to 1000 µl in terms of numbers of identified protein groups, but that starting with only 250 µl resulted in ∼20% fewer identifications and more variable results between replicates. This leads to the conclusion that 500 µl plasma is the smallest optimal volume for our protocol (Figure 4A). Secondly, we assessed three different popular sample digestion workflows for plasma proteomics to identify the workflow best suited for this type of biological material. In terms of unique peptides and identified protein groups, we found in-solution digestion in urea-containing buffer and PAC on-bead digestion to be comparable, with in-solution digestion resulting in slightly more unique peptides and PAC digestion resulting in slightly more protein groups (Figure 4B and Supporting Information Figure S3B). The protein sequence coverage was comparable between PAC and in-solution digestion, but with more missed cleavages with in-solution digestion, suggesting a less complete digestion in line with previous observations (Figure 4C and Supporting Information Figure S3C) [24].
Next, we wished to test the performance of the LC-MS/MS platform with increasing LC gradient lengths and different acquisition modes (Figure 4D and Supporting Information Figure S3D). We found that the number of protein and peptide identifications was lower in DDA mode compared to DIA with all gradients, especially when accounting for the reproducibility between technical replicates, highlighting the superiority of DIA in combination with LC gradients of less than one hour [29].
Using the directDIA computational analysis workflow without a spectral library, we found that the 45 min gradient in DIA mode led to identification of the highest number of peptides and proteins, closely followed by the 21 min gradient. Interestingly, when analyzing the same DIA data using a project-specific gas-phase fractionated spectral library, we could increase the number of identified protein groups by more than 50%, identifying an additional 1400 proteins (3964 proteins in total) with the 45 min gradient. We observed similar increases in protein identifications for the 12 and 21 min gradients but with a less pronounced effect in the latter.
To find the best balance between MS measurement time and resulting protein identifications, we also evaluated the number of proteins identified per LC gradient minute. We found that the 12 min gradient was by far the most efficient in identifying proteins in both unfractionated and fractionated plasma samples (Figure 4E,F). Here, the DIA data acquisition allowed identification of 131 unique protein groups per minute in the 20k EV fraction, which could be increased to 237 proteins per minute when using a spectral library. The 21 and 45 min methods identified 92 and 56 proteins per gradient minute, respectively, and this could be increased to 131 and 88 proteins per minute with a spectral library.
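The efficiency metric is simply the number of identified protein groups divided by the gradient length. The sketch below reproduces it for the one total reported in the text (3964 protein groups with the 45 min gradient) and derives the relative gain from the spectral library using the per-minute rates quoted above. Totals back-calculated from the rounded rates are approximations, and the variable and function names are illustrative.

```python
# Proteins per gradient minute in the 20k EV fraction, as quoted in the text.
RATE_DIRECTDIA = {12: 131, 21: 92, 45: 56}
RATE_WITH_LIBRARY = {12: 237, 21: 131, 45: 88}


def proteins_per_minute(protein_groups: int, gradient_min: float) -> float:
    """Identification efficiency: protein groups per LC gradient minute."""
    return protein_groups / gradient_min


def library_gain_percent(rate_without: float, rate_with: float) -> float:
    """Relative increase in identifications when a spectral library is used."""
    return 100.0 * (rate_with / rate_without - 1.0)


if __name__ == "__main__":
    # 3964 protein groups with the 45 min gradient and spectral library.
    print(round(proteins_per_minute(3964, 45)))  # 88
    for minutes in sorted(RATE_DIRECTDIA):
        gain = library_gain_percent(RATE_DIRECTDIA[minutes], RATE_WITH_LIBRARY[minutes])
        approx_total = RATE_WITH_LIBRARY[minutes] * minutes  # from rounded rate
        print(f"{minutes} min: about {approx_total} protein groups, +{gain:.0f}% with library")
```

From the rounded rates, the library gain works out to roughly 81%, 42%, and 57% for the 12, 21, and 45 min gradients.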

Plasma proteomics with EV-enrichment for studying immune-related responses to cancer immune therapy
EVs are important mediators of immune signaling, and to explore the potential of our optimized workflow, we hypothesized that it could be
Patient samples were frozen as PRP, and to minimize contamination with platelet-derived proteins and EVs, we removed platelets through an initial centrifugation step prior to 20k EV fractionation. Across samples, we were able to identify more than 2500 protein groups in the 20k EV fraction compared to 380 proteins in the unfractionated plasma (Figure 5A; Supporting Information Table S3). Notably, the EV enrichment tripled the number of detectable proteins associated with the Gene Ontology (GO) term immune response process compared to unfractionated plasma samples (Figure 5B). Furthermore,
Despite our small sample size, we observed regulation of multiple proteins associated with the immune system. As expected, the detectable number of regulated proteins in the unfractionated plasma was low, but included proteins of the S100 family (S100A9 and S100A12) associated with malignant melanoma disease progression [30]. In the 20k EV fraction, the observed immune regulation was much broader and included proteins associated with cytokine signaling (JAK2 and TNFAIP8L2) and potential biomarkers for malignant melanoma (LGALS3BP, FCN2, RAB22N) [31-35].

DISCUSSION
Through enrichment of EVs and workflow optimization, we are able to routinely achieve a plasma proteome depth of more than two thousand proteins, reaching up to 3800 proteins in single runs. This far exceeds what is currently accessible for single-shot LC-MS/MS of unfractionated plasma, as our own results with the unfractionated samples also demonstrate. Notably, our results from unfractionated plasma are comparable to current state-of-the-art single-shot proteomics of undepleted plasma, underlining the need for strategies to reduce the dynamic range if low abundant proteins are to be quantified [5,36].
The relatively simple sample preparation protocol based on lower range ultracentrifugation, combined with the high performance in DIA mode even when using 12 min LC gradients, makes the 20k EV fraction attractive for high-throughput analyses, as up to a hundred samples can be analyzed per day using this strategy. Even higher analysis depth can be reached using 45 min LC gradients in combination with a project-specific spectral library, but at the cost of throughput.
Enrichment of EVs is one way to overcome the dynamic range issue of undepleted human plasma, which will remain challenging for LC-MS/MS systems in the foreseeable future. Recently, nanoparticles with differential binding affinities have been introduced as an alternative approach to selectively enrich for lower abundant plasma proteins [9]. By this approach, up to 2500 proteins can be quantified in undepleted human plasma, but it remains costly and is, in contrast to our approach, limited by the necessity of proprietary instruments.
Figure caption (fragment): Significantly regulated proteins with >50% change after therapy and p < 0.05 are highlighted, and those related to the GO term immune system processes (GO:0002376) are annotated in red. GO, Gene Ontology; EVs, extracellular vesicles
A potential limitation of our approach is the unclear biological significance of EVs in plasma. Our results also show that platelet-derived vesicles are likely to make up a significant proportion of the 20k EV fraction. Furthermore, the fraction of platelet-derived EVs is likely to be variable and affected by pre-analytical variability during sample handling and unintended platelet activation, which will cause platelets to shed EVs [37]. Thus, depending on the aim of the analysis, the proportion of platelet-derived EVs and the pre-analytical variance should be taken into account. Still, accumulating evidence points toward multiple important biological functions of EVs, and new methods to study EVs may further expand this field of research.
To evaluate a potential application for our deep plasma proteomics workflow in a clinical setting, we included a preliminary analysis of blood samples from three patients with metastatic melanoma receiving cancer immune therapy. The results show that the workflow is applicable to biobank samples, and that 20k EV fractionation markedly increases the detectable number of proteins. Importantly, the number of regulated immune-related proteins was tripled and went beyond the high-abundant complement and immunoglobulin proteins primarily found in the unfractionated plasma, demonstrating a potential for immune monitoring with 20k EV profiling and biomarker discovery in larger patient cohorts.

CONCLUSION
Enrichment of EVs is a simple but powerful approach to facilitate comprehensive analysis of the human plasma proteome that is otherwise inaccessible by LC-MS/MS analysis due to the high dynamic range of protein abundances. We demonstrate in the present work that this approach far exceeds unfractionated plasma in terms of proteome depth and that the results are comparable with expensive cutting-edge techniques. Further, this method is likely to uncover more of the biological importance of EVs in human plasma and could lead to the discovery of new biomarkers.

ACKNOWLEDGMENTS
The work is carried out as a part of the BRIDGE -Translational Excel-