Classification of single extracellular vesicles in a double nanohole optical tweezer for cancer detection

A major challenge in cancer prognostics is finding early biomarkers that can accurately identify cancer. Circulating tumor cells are rare, and circulating tumor DNA cannot provide information about the originating cell. Extracellular vesicles (EVs) contain cell-specific information, are abundant in biological fluids, and have properties that differ between cancerous and non-cancerous cells of origin. Fluorescence measurements are limited by intrinsic fluorescent background signals, photobleaching, non-specific labelling, and EV structural modifications. Here, we demonstrate a label-free approach to classifying three different EVs, derived from non-malignant, non-invasive cancerous, and invasive cancerous cell lines. Using double nanohole (DNH) optical tweezers, the scattering from single trapped EVs is measured, and using a 1D convolutional neural network (CNN), we classify the time-series optical signal into its respective EV class with greater than 90% accuracy.


Introduction
Cancer diagnostics, monitoring, and prognostics are rapidly growing fields, and biomarkers present in blood such as circulating tumor cells (CTCs), circulating tumor DNA (ctDNA), and extracellular vesicles (EVs) provide a promising avenue for early-stage detection [1]. CTCs present in blood are rare, posing challenges to early detection [2,3], and ctDNA cannot be used to identify the originating cell. In contrast, EVs provide good diagnostic capabilities as they contain lipids, proteins, and nucleic acids from the parent cell, enclosed within a lipid bilayer. Released by all cells through exocytosis (exosomes; 30-150 nm) or budding and pinching off (ectosomes; ∼0.1-1 µm), EVs are stable and plentiful in biological fluids (2-6 × $10^{10}$ EVs/ml). Due to their different cellular origin and formation, exosomes and ectosomes provide different information about the cell. The biophysical properties of cancer-derived EVs are unique: the lipid composition, nucleic acid and protein content, and membrane rigidity are altered compared to healthy cell-derived EVs. Exosomes originate from the endocytic pathway and are enriched in endosomal components like tetraspanins (CD63, CD81), ALIX, TSG101, and endosome-associated lipids like cholesterol and ceramide [4]. Exosomal cargo composition provides information about the endosomal sorting and trafficking pathways active in the originating cell [5,6]. Exosomes from cancer cells are enriched in proteins and RNAs involved in tumour progression, like oncoproteins [7], pro-metastatic factors [8], and immunosuppressive molecules [9]. Ectosomes, on the other hand, are plasma membrane-derived [10,11] and enriched in plasma membrane proteins like integrins, selectins, and metalloproteinases [6,12]. Their cargo reflects the state of the plasma membrane and surface molecules of the parent cell at the time of release [5,6]. Ectosomes from metastatic tumour cells are enriched in matrix metalloproteinases [13] and integrins [14] that facilitate invasion and metastasis. Together, these properties suggest that EV liquid biopsies present an advantage over other approaches [15][16][17][18].
One of the biggest challenges in cancer detection is finding early biomarkers, which requires disruptive technologies that can accurately identify them [1]. Labelling is common in EV cancer analysis, with staining and fluorophore bonding being widely used techniques. Fluorescence measurements are limited by intrinsic fluorescent background signals, photobleaching, non-specific labelling, and EV structural modifications [19].
However, EVs are released by all cells, and thus the population of EVs in a patient will be non-specific to cancer. Therefore, a method of interrogating single exosomes to distinguish between cancerous and non-cancerous is imperative. Given the broad size range of EVs, isolating exosomes from ectosomes becomes important for cancer detection. Current methods for isolating exosomes include ultracentrifugation [20], ultrafiltration [21], immunoaffinity capture [22], precipitation [23], and microfluidic-based methods [24], but these have low reproducibility and low throughput [25]. Since it is difficult to separate exosomes and ectosomes solely based on their size, we use the general word EVs to refer to the small (<200 nm) sub-cellular particles isolated via a 100 000 × g spin on an ultracentrifuge, as recommended by MISEV 2023 (the latest research guideline in EV studies) [26]. A preferred detection platform would be able to work with a solution containing both ectosomes and exosomes while providing accurate cancer detection.
Single EV analysis is desired to eliminate the background from healthy cell-derived EVs, which has led other researchers to consider laser tweezers to trap single EVs and take their Raman spectrum [27][28][29]. This approach suffers from large laser powers that can damage the EVs, low gradient forces that inhibit trapping of exosomes and small ectosomes, and weak Raman signals that require lengthy integration times. Signal from single EVs can also be acquired using plasmonic antennas [30][31][32] and surface plasmon resonance imaging [33][34][35].
We report low-power, high-throughput optical trapping of single EVs using mass-fabricated double nanoholes (DNHs). This approach is label-free, using the optical signal and machine learning to accurately classify EVs from three different cell lines: non-malignant, non-invasive cancerous, and invasive cancerous. While greater accuracy will be required for clinical applications, this is a promising first step towards label-free diagnostics.

DNH optical trapping
The DNH provides an enhanced gradient force, allowing for more stable trapping at lower powers compared to conventional optical tweezers, as well as introducing a spatial confinement that reduces the effective trapping volume. Figure 1(a) shows representative trapping events for each of the EVs: MCF10A (non-malignant), MCF7 (non-invasive, cancerous), and MDA-MB-231 (invasive, cancerous). The number of trapping events for each EV sample is MCF10A (134), MCF7 (107), and MDA-MB-231 (76). Measurements were repeated over multiple days, on multiple DNHs, and from multiple EV isolations. A random selection of 24 traps of each EV type is shown in figures S1-3. Qualitative differences are readily seen in the trapping data; for example, MDA-MB-231 regularly showed a stronger initial step when trapped. There are also differences observed in the noise amplitude and power spectral density (PSD), which give information about the size, refractive index (RI), and shape of the EV. A probability density function is fit to the 10 Hz low-pass filtered signal, showing the amplitude of the signal as well as a qualitative representation of the change in transmission through the aperture during trapping. The PSD with a Lorentzian fit is also shown for each signal, providing information about the trapping stiffness, which relates to the size of the trapped EV [54]. While the corner frequency does contain information about the size, shape, and RI of the particle, it is not used for classification here [55]. There are no statistically significant size differences observed between EVs, as can be seen in figures S5(b) and S8, as well as table S1. A simplified diagram of the optical tweezer is shown in figure 1(b); a full review of the method may be found in past works [56][57][58]. The change in transmission through the nanoaperture for each line of EVs is shown in figure 1(c). Due to dielectric loading of the nanoaperture, the transmission through the nanoaperture changes with a magnitude depending on the RI of the biomolecule. EVs are typically characterized as having an RI less than 1.42 [59,60], which is close to the RI of PBS (1.34), leading to minimal changes in transmission. However, we show that DNH optical tweezers are sensitive enough to observe differences between EV cell lines. The percent change in transmission for MCF10A, MCF7, and MDA-MB-231 is highlighted in table 1, and the one-way analysis of variance results in table S2 show the statistical significance of the results. While the p value ($2.2 \times 10^{-7}$) indicates a strong rejection of the null hypothesis (that the averages of the percent change in transmission are the same), it must be noted that the MCF10A and MCF7 distributions overlap significantly, so the change in transmission alone cannot be used as a classification scheme.
Previous studies have suggested that the surface charge of trapped EVs may influence the transmission signal [61][62][63], with evidence indicating varying zeta potentials between cancerous and non-cancerous EVs [64,65]. Of particular interest is the ease of trapping, with the time between turning on the laser and a single EV being trapped (latency) taking a few seconds. Stable trapping is achieved at low power (3.7 mW) to minimize sample heating and potential damage to the EV. When the laser power was increased to 9.7 mW for tens of seconds, no bursting of the trapped EV was observed, allowing for higher strength trapping in the future. Considering a spherical particle of radius 35 nm, the diffusion coefficient in water is $6.3 \times 10^{-12}$ m$^2$ s$^{-1}$, so the EV requires only 8 s to diffuse 10 µm. The laser was kept off for a minimum of 60 s to allow ample time for the EV to diffuse away and minimize the chance of trapping the same EV. To verify that EVs were not damaged during trapping, the laser was turned off after trapping to release the EV, and videos of the trapping have been added to the database (see data availability). The high thermal conductivity of the gold and the low absorption at 980 nm lead to minimal heating of the DNH. Past works have confirmed that the EV is inside the gap when trapped [53], adding an inherent upper limit to the size of the trapped EV.
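The diffusion estimate above can be reproduced with the Stokes-Einstein relation and the one-dimensional mean-squared displacement. The following is a minimal sketch, not the authors' calculation: the temperature (293.15 K), the viscosity of water (1.0 mPa s), and the use of the 1D relation $x^2 = 2Dt$ are assumptions chosen because they give values close to those quoted in the text.

```python
import math

K_B = 1.380649e-23  # Boltzmann constant in J/K


def stokes_einstein(radius_m, temperature_k=293.15, viscosity_pa_s=1.0e-3):
    """Diffusion coefficient (m^2/s) of a sphere in a viscous fluid.

    Temperature and viscosity defaults are assumed values for water
    near room temperature; they are not stated in the text.
    """
    return K_B * temperature_k / (6.0 * math.pi * viscosity_pa_s * radius_m)


def diffusion_time_1d(distance_m, diffusion_coeff):
    """Time for the 1D RMS displacement sqrt(2 D t) to reach distance_m."""
    return distance_m ** 2 / (2.0 * diffusion_coeff)


# 35 nm radius particle diffusing 10 um, as in the text
D = stokes_einstein(35e-9)
t = diffusion_time_1d(10e-6, D)
print(f"D = {D:.2e} m^2/s, time to diffuse 10 um = {t:.1f} s")
```

With these assumed parameters the result is roughly $6 \times 10^{-12}$ m$^2$ s$^{-1}$ and about 8 s, well below the 60 s laser-off interval used between traps.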
Figure S4 shows the average gap size is 116 nm, allowing for trapping of single exosomes and small ectosomes in mixed solutions containing small and large vesicles. A nanoparticle tracking analysis (NTA) intensity vs size plot is shown in figure S5(a). A root-mean-squared-deviation (RMSD) vs corner frequency plot is [70][71][72][73][74][75]. While traditional analysis methods (RMSD, PSD) are unable to capture the differences between the EVs, the CNN is able to extract features that are not readily apparent. Figure 2(a) shows a conceptual overview of the CNN model. The MATLAB function 'trainNetwork' was used for creating the neural network. The data sets were built to ensure a common signal length for training and an equal number of trapping events for each EV (50 each). Each file of trapped data was split into one-second segments, preserving the temporality of the signal. The segments were downsampled from 100 kHz to 500 Hz to accentuate the main features of the signals. In addition to avoiding bias from signal length, segmenting the data works to the advantage of the ability of 1D CNNs to learn from short-term data to identify long-term temporal features. The segmented frames were then divided into training, validation, and testing sets with a 50%, 30%, and 20% ratio. The numbers of segments used for training, validation, and testing were 1220, 732, and 488, respectively. Table 2 shows the layers used in the CNN training and their respective sizes. Figure 2(b) shows the confusion matrix of our trained model, with the precision, recall, and F1 score of each EV shown in table 3.
Overfitting of the model is unlikely for two reasons. First, the model performs well on data that was kept out of the training. Second, the learning curves seen in figure S9 show that the model stops learning before the validation and test performance get worse. Repeated training with the same data split (50%, 30%, 20%) but different segments in each grouping gave results close to the model shown here, within a 1% difference in accuracy. Precision is the ratio between the true positive classifications and all of the positive classifications, $\mathrm{TP}/(\mathrm{TP}+\mathrm{FP})$, or the measure of how many classifications were true out of all EV classifications. Recall is the measure of the model correctly identifying true positives, $\mathrm{TP}/(\mathrm{TP}+\mathrm{FN})$; in this case the recall score tells how many EVs were correctly classified. The F1 score provides a balanced assessment between precision and recall, defined as $2 \times \mathrm{precision} \times \mathrm{recall}/(\mathrm{precision}+\mathrm{recall})$, thereby measuring how effectively the model makes the trade-off between precision and recall. The overall test accuracy from the initial model split was 88.52%. Four files from each EV dataset were removed from the training datasets for external testing, but otherwise underwent the same segmenting and filtering process as the training sets; thus there were 205 segmented data sets tested. Additional testing with data kept out of the initial model split gave accuracies of 93.33%, 86.44%, 100.00%, and 94.12%, and the confusion matrices can be found in figure S10. Each test was performed with segments from each EV type.
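The metric definitions above can be made concrete with a short calculation from a multi-class confusion matrix. This is an illustrative sketch only: the function names are hypothetical, and the example counts are made up for demonstration; they are not the values in figure 2(b) or table 3. Specificity, which is quoted later in the Discussion, is included for completeness.

```python
def per_class_metrics(confusion, labels):
    """Return precision, recall, F1 and specificity for each class.

    confusion[i][j] is the number of segments whose true class is i
    and whose predicted class is j (one-vs-rest counts are derived).
    """
    total = sum(sum(row) for row in confusion)
    metrics = {}
    for k, label in enumerate(labels):
        tp = confusion[k][k]
        fn = sum(confusion[k]) - tp                   # true k, predicted other
        fp = sum(row[k] for row in confusion) - tp    # predicted k, true other
        tn = total - tp - fn - fp
        precision = tp / (tp + fp) if (tp + fp) else 0.0
        recall = tp / (tp + fn) if (tp + fn) else 0.0
        denom = precision + recall
        f1 = 2 * precision * recall / denom if denom else 0.0
        specificity = tn / (tn + fp) if (tn + fp) else 0.0
        metrics[label] = {
            "precision": precision,
            "recall": recall,
            "f1": f1,
            "specificity": specificity,
        }
    return metrics


def overall_accuracy(confusion):
    """Fraction of all segments that lie on the diagonal."""
    total = sum(sum(row) for row in confusion)
    correct = sum(confusion[k][k] for k in range(len(confusion)))
    return correct / total


# Made-up counts for illustration only (NOT the paper's results)
labels = ["MCF10A", "MCF7", "MDA-MB-231"]
example = [
    [45, 3, 2],
    [4, 44, 2],
    [1, 5, 44],
]
results = per_class_metrics(example, labels)
for name, m in results.items():
    print(name, {key: round(value, 3) for key, value in m.items()})
print("accuracy:", round(overall_accuracy(example), 4))
```

Computing the metrics per class in a one-vs-rest fashion, as here, is one common convention for three-class problems; whether the reported values were obtained this way is an assumption.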

EVs
After isolating EVs from conditioned media via ultracentrifugation, EVs were characterized by western blot for the EV markers CD9 and CD63 and the cell lysate contaminant calnexin. Western blot showed that CD9 and CD63 are abundant in EVs relative to the cell lysate, as seen in figure S6. Nanoscale flow cytometry can be employed for detection of EVs between 80 nm and 1000 nm at single-particle resolution [76][77][78]. Here, isolated EVs were analyzed for CD63 by nanoscale flow cytometry using a 405 nm violet side scatter trigger, which showed that CD63 is abundant on the surface of EVs, as seen in figure S7. The NTA sizing data is shown in figure S8 and table S1.

Discussion
Past works using atomic force microscopy and machine learning have shown mechanical variations of EVs that can be used to predict parent cells [79]. It is possible that these variations are also responsible for the fluctuations that allow for classification in our experiments. While the size of the EV can influence the optical signal, figure S5(b) shows that there is significant size overlap between all of the EVs, so this cannot be used for classification alone.
The levels of cancer cell-derived EVs in humans are low in comparison to healthy EVs. One study shows a level of 23-1900 cancer-derived EVs per ml of blood per 1 mm$^3$ of tumour [80]. This suggests that a substantial increase in cancerous EV concentration would not be observable until the tumor exceeds 50 cm$^3$ in size, making early-stage detection difficult. The achieved specificity in identifying whether a trapped EV is cancerous or non-cancerous is 98.06%, whereas the specificity of classifying the trapped EV as MCF10A, MCF7, or MDA-MB-231 is 96.5%, 97.9%, and 85.3%, respectively. Clinical diagnostics require high specificity (>99%); however, current on-market exosome diagnostics have achieved 89% [81]. With improvement in the accuracy of our model, the next steps involve trapping EVs using a mixed solution comprising a small amount of cancerous EVs and non-malignant EVs. Then, using the trained CNN model, classification of the trapped signal can be performed to determine the ratio of cancerous to non-cancerous EVs and the detection capabilities at low concentration. Improvements in classification may come from a multi-layer network or from exploring other models. DNH optical tweezers have shown precedent for working in 'dirty solutions', with capabilities to distinguish between different proteins [82].
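The scale of the cancer-derived EV concentration quoted in this paragraph can be checked with simple arithmetic. The sketch below is an illustration under two assumptions that are not stated in the text: that the cancer-derived EV concentration scales linearly with tumour volume, and that the total background concentration is the 2-6 × $10^{10}$ EVs/ml given in the Introduction.

```python
MM3_PER_CM3 = 1000  # 1 cm^3 = 1000 mm^3


def cancer_ev_per_ml(tumour_cm3, ev_per_ml_per_mm3):
    """Cancer-derived EVs per ml of blood, assuming linear scaling
    with tumour volume (an assumption made for this illustration)."""
    return tumour_cm3 * MM3_PER_CM3 * ev_per_ml_per_mm3


# Range reported in the text: 23-1900 EVs per ml per mm^3 of tumour
low = cancer_ev_per_ml(50, 23)
high = cancer_ev_per_ml(50, 1900)

# Total EV background from the Introduction: 2e10 to 6e10 EVs per ml
fraction_low = low / 6e10    # fewest cancer EVs, largest background
fraction_high = high / 2e10  # most cancer EVs, smallest background

print(f"cancer-derived EVs per ml at 50 cm^3: {low:.2e} to {high:.2e}")
print(f"fraction of all EVs: {fraction_low:.1e} to {fraction_high:.1e}")
```

Under these assumptions, even a 50 cm$^3$ tumour contributes well under 1% of the circulating EVs, which is why single-EV classification with high specificity matters for this application.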
Compared to prior nanoaperture optical tweezer (NOT) experiments involving other biomolecules (DNA, proteins, peptides) and nanoparticles (quantum dots, upconverting nanoparticles, nanocrystals), the throughput of EV trapping is 20× faster (on the order of seconds) at 10× lower concentration (0.1 µg/µl). Throughput is currently limited by human factors involving fine alignment of the stage, laser control, and data organization between trapping events, all of which could be automated in the future for even higher throughput. Various factors affect the ability to trap, including thermophoresis [83]. In the case of EVs, it is found that they are less thermophobic than proteins, which would lead to faster trapping, especially considering the lower induced temperature from the low-power trapping [84]. It is possible to 'trap' an EV larger than the DNH gap; however, qualitatively this was a rare event and is characterized by a distinct slow decrease in APD signal and a large change on the camera. Thus it is highly likely that only EVs smaller than the DNH gap are trapped. While a fluidics system was not necessary to trap the EVs, integration of a fluidic system could further improve the time-to-trap and enable additional functionality for studying EVs in different solutions after trapping, or for sorting EVs after trapping. To apply this platform for clinical work, the throughput needs to be scaled extraordinarily, to the level of 10 million traps. As it stands now, trapping and exporting the data take too long to allow more than a few hundred traps in 8 hours. Possible ways to increase the throughput include trapping with arrays of nanostructured fiber tips, automation, or microfluidic integration. An added benefit of the DNH platform is the ability to isolate specific EVs for further characterization elsewhere, such as by electron microscopy [53], or for sorting using a combined nanopore/optical tweezer [45]. Isolating specific EVs allows for analysis of the contents of the EV (proteins and nucleic acids), for determining their origins, or for prognostics.

Isolation of EVs
To perform EV isolation using an ultracentrifuge, $5 \times 10^{6}$ cells were seeded in 15 cm plates in 15 ml of DMEM media supplemented with 10% EV-depleted FBS for each condition. After 48 hours, conditioned media was harvested and spun at 300 × g for 10 min at 4 °C to remove dead and floating cells, then at 2500 × g for 30 min at 4 °C to remove cell debris and apoptotic bodies. Next, the supernatant was transferred to polyallomer 38.5 ml open-top ultracentrifuge tubes (Beckman Coulter) and was centrifuged at 10 000 × g for 30 min at 4 °C to isolate large EVs. A SW32Ti Rotor (Beckman Coulter) was used in an Optima XPN-100 ultracentrifuge (Beckman Coulter). The supernatant was transferred to new ultracentrifuge tubes and ultracentrifuged at 100 000 × g for 70 min at 4 °C, using the same ultracentrifuge tubes and rotor as above. The resulting pellets were washed by resuspension in 0.02 µm filtered Phosphate Buffered Saline (PBS) (MultiCell) and re-pelleted under the same conditions as above. EV pellets were resuspended in 30 µl of 0.02 µm filtered PBS and used fresh.

Flow cytometry analysis of EVs
After isolating the EVs via ultracentrifuge, 10 µl of freshly isolated EVs was placed in a 1.5 ml tube and incubated with 1.1 µg CD63-FITC (Abcam #ab18235) and mouse-IgG1 isotype control (Abcam #ab91356) antibody for 30 min in the dark at room temperature. Then, the EV samples were diluted 1:10 in 90 µl of 0.02 µm filtered PBS (MultiCell) and transferred into a flat-bottomed 96-well plate. The analysis was conducted on a CytoFLEX S flow cytometer (Beckman) utilizing MilliQ water (ultrapure filtered and deionized water from the MilliQ water purification system) as the diluent in place of sheath fluid. The instrument was configured with the following parameters: triggering on 405 nm violet side scatter, violet side scatter detection set to 1027, and a slow speed setting for 60 s (analyzing 10 µl of sample per minute). Manual gating was employed to isolate populations of interest, referencing isotype controls. Gain settings were adjusted as follows: VSSC: 100, and FITC: 500.

NTA of EVs
Isolated EVs obtained from three cell lines, MCF10A, MCF7, and MDA-MB-231, were diluted 5:1000 in 0.02 µm filtered Phosphate Buffered Saline (PBS). Subsequently, a NanoSight LM10 equipped with a blue 488 nm laser (Malvern Panalytical) was used to assess the quantity and dimensions of the diluted EVs. To ensure accurate measurements, three 30 s readings were captured, employing a syringe pump to sustain EV movement within the chamber at a flow rate of 40 µl min$^{-1}$. The camera level was adjusted to level 14. A consistent detection threshold of 5 (AU), which determines the minimum intensity for a scattered light signal to be detected as a real signal, was maintained throughout the analysis [85]. Data analysis was performed using NTA software version 3.4.

Western blot
After harvesting the conditioned media for EV isolation, the remaining cells were washed with PBS twice, and then 200 µl of ice-cold RIPA buffer (Thermo Scientific #PI89900) was added to each plate. After scraping the cells with a cell scraper, cells were collected in a 1.5 ml tube and agitated on a rotating shaker for 10 min at 4 °C. Then, the tube was centrifuged at 16×10 6 for 10 min at 4 °C. After collecting the supernatant and discarding the pellet, the protein concentrations of the cell lysate and the isolated EVs were measured using the Pierce™ Rapid Gold BCA Protein Assay Kit and the Micro BCA™ Protein Assay Kit (Thermo Fisher Scientific), respectively, according to the manufacturer's instructions. Following the protein concentration quantification, 13 µg of protein was combined with 25% loading buffer (Novex Life Technologies), and Milli-Q water was added to reach the final volume of 40 µl. The samples underwent boiling at 95 °C for 10 min and were subsequently loaded onto Bolt 4%-12% Bis-Tris Plus gradient gels (Thermo Fisher Scientific). Gel electrophoresis was performed at 200 V for 30 min using MES running buffer (50 mM MES (Sigma), 50 mM Tris Base, 0.1% SDS, 1 mM EDTA, pH 7.7). All reagents were obtained from Fisher Bioreagents unless otherwise specified. Following gel electrophoresis, proteins were transferred onto a 0.45 µm nitrocellulose membrane (BioRad) through wet transfer (190 mM glycine, 25 mM Tris Base). Post-transfer, membranes were blocked in 5% milk in TBS-T (20 mM Tris base, 160 mM NaCl, 0.1% Tween) for 1 h. Primary antibodies were applied overnight at 4 °C in 1% milk at the following dilutions: CD9 (EMD Millipore #CBL162) 1:1000 µl, CD63 (BD Pharmingen #556 019) 1:1000 µl, Calnexin (Abcam #AB22595) 1:5000 µl, β-actin (Sigma #00 001 20 485) 1:2000 µl. Membranes underwent three washes of 10 min each in TBS-T. Secondary Licor IRDye 680RD and IRDye 800CW antibodies were applied in 1% milk for 1 h in the dark, followed by repeating the washing steps. Membranes were imaged on the Licor Odyssey® CLx following the manufacturer's guidelines, employing Image Studio Lite software (5.2).

Colloidal lithography of DNHs
DNHs were fabricated using a modified approach from past works [86]. Standard microscope slides (Fisherbrand 12-550C, 75 × 50 × 1.0 mm) were cut into thirds using a diamond scribe, cleaned by sonicating in ethanol for ten minutes, and dried under nitrogen. A solution was made of 10 µl of 600 nm polystyrene beads (Alpha Nanotech) and 1 ml of ethanol. Using 10 µl of the polystyrene-ethanol solution, a zig-zag pattern was deposited on the microscope slides and left overnight. Once the ethanol evaporated, the coated slides were covered with 7 nm of titanium and 70 nm of gold using a sputter deposition system (MANTIS QUBE). The polystyrene was removed from the metal using tape lift-off, and the chips were plasma cleaned for ten minutes and stored in ultrapure water. We currently produce 36 chips per fabrication run, but this approach is highly scalable.

Trapping solution
A 24×60-1 microscope cover glass (Globe Scientific Inc.) was cleaned using isopropyl alcohol and dried under nitrogen. An image spacer (Grace Bio-Labs GBL-654 008-100EA) was placed in the center of the coverslip and 9.37 µl of the solution was pipetted into the spacer. The gold DNH sample was placed on top of the spacer so that the gold is in contact with the solution. EVs were diluted in PBS to a concentration of 0.112 µg/µl (MCF10A), 0.494 µg/µl (MCF7), and 0.191 µg/µl (MDA-MB-231).

Software, statistical analysis, and data acquisition

Nanoaperture optical tweezers
All data analysis was performed using custom Python code. Data was collected at a sampling rate of 100 kS/s. The total number of trapping events analyzed is 317. The power spectral density (PSD) was calculated using a 5 s window of the trapped signal, and then a boxcar average was taken with a weight of 10. A Lorentzian function with corner frequency $f_c$ is fit to the boxcar-averaged PSD. The RMSD is calculated by dividing the same 5 s portion of the trapped signal into sections based on a window length of 5000. The RMSD is then divided by the mean of the trapped signal and an average of all the sections is taken. To normalize the values, an average of all laser RMSDs was subtracted from each trapped RMSD.
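The windowed RMSD procedure described above can be sketched as follows. This is a minimal illustration, not the authors' published code: the function names are hypothetical, and it assumes that the RMSD of each section is taken about that section's own mean and that "laser RMSD" refers to the same quantity measured on untrapped (laser-only) signal.

```python
import math


def windowed_rmsd(signal, window=5000):
    """Mean of per-section RMSD values, each divided by the signal mean.

    The signal is cut into consecutive sections of `window` samples;
    at 100 kS/s a 5 s trace gives 100 sections of 50 ms each.
    Assumption: deviations are measured about each section's own mean.
    """
    n_sections = len(signal) // window
    if n_sections == 0:
        raise ValueError("signal is shorter than one window")
    signal_mean = sum(signal) / len(signal)
    values = []
    for i in range(n_sections):
        section = signal[i * window:(i + 1) * window]
        section_mean = sum(section) / window
        rmsd = math.sqrt(sum((x - section_mean) ** 2 for x in section) / window)
        values.append(rmsd / signal_mean)
    return sum(values) / len(values)


def normalised_rmsd(trap_rmsd, laser_rmsds):
    """Subtract the average laser-only RMSD from a trapped RMSD."""
    baseline = sum(laser_rmsds) / len(laser_rmsds)
    return trap_rmsd - baseline


# Synthetic check: a signal alternating between 0.9 and 1.1 has mean 1.0
# and a relative RMSD of 0.1 in every section.
synthetic = [0.9, 1.1] * 5000
relative_rmsd = windowed_rmsd(synthetic, window=5000)
print(round(relative_rmsd, 6))
```

Dividing by the mean of the trapped signal makes the RMSD a relative quantity, so that traces recorded at different transmitted powers can be compared.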

CNN
MATLAB convolution1dLayer was used as the platform for sequence classification. A filter size of 5, 32 filters, 1 feature, 3 classes, a mini-batch size of 5, 200 epochs, and a learning rate of 0.001 were used for the training. The files are split between training and testing and divided into smaller time segments (1 s). Additional files were kept outside of the model generation and tested afterwards. The optical signal was normalized to remove the DC offset due to day-to-day differences and small aperture variations, as well as to enhance the features present in the data. The normalization was of the form $V_{\mathrm{norm}} = 10\,(V - \mathrm{mean}(V)) + 2\,\mathrm{mean}(V)$, where $V$ is the measured voltage; this allows for amplification of the fluctuation features in the data and the amplitude of the signal. The normalized data was then downsampled from 100 kHz to 500 Hz, and all files were ensured to have the same length.
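The preprocessing steps described above (normalization, downsampling from 100 kHz to 500 Hz, and cutting into 1 s segments) can be sketched in a few lines. This is an illustrative sketch under stated assumptions, not the published pipeline: the function names are hypothetical, the downsampling is plain decimation (keeping every 200th sample) because the text does not specify the method, and the split sizes simply reproduce the 50%/30%/20% ratio.

```python
def normalize(voltage, gain=10.0, offset_scale=2.0):
    """Apply V_norm = gain * (V - mean(V)) + offset_scale * mean(V)."""
    mean_v = sum(voltage) / len(voltage)
    return [gain * (v - mean_v) + offset_scale * mean_v for v in voltage]


def downsample(samples, factor=200):
    """Keep every `factor`-th sample (100 kHz / 200 = 500 Hz).

    Plain decimation without an anti-aliasing filter is an assumption.
    """
    return samples[::factor]


def segment(samples, segment_len=500):
    """Cut into consecutive 1 s segments (500 samples at 500 Hz),
    discarding any incomplete segment at the end."""
    n_segments = len(samples) // segment_len
    return [samples[i * segment_len:(i + 1) * segment_len]
            for i in range(n_segments)]


def split_counts(n_segments, train_pct=50, val_pct=30):
    """Number of segments for training, validation and testing."""
    n_train = n_segments * train_pct // 100
    n_val = n_segments * val_pct // 100
    return n_train, n_val, n_segments - n_train - n_val


# Synthetic 3 s trace sampled at 100 kHz
trace = [float(i % 7) for i in range(300000)]
segments = segment(downsample(normalize(trace)))
print(len(segments), len(segments[0]))
print(split_counts(2440))
```

With 2440 segments in total, the split reproduces the 1220, 732, and 488 segments reported for training, validation, and testing.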

Conclusion
We have shown that DNH optical tweezers are capable of easily trapping EVs without damaging them. Differences in EVs are identified through changes in transmission related to the dielectric properties, and by using a CNN to identify less readily apparent features, accuracy greater than 85% is achieved in classifying trapped EVs into specific cell lines. This lays the groundwork for future work with spiked samples of low-concentration cancerous EVs and, with improvements in throughput, clinical applications. One potential future avenue would be to investigate CNNs on dielectric nanoparticles of various composition, size, and shape to obtain some physical relation to the CNN parameters.

Figure 1. (a) Representative trapping signal, probability density function, and power spectral density boxcar average with Lorentzian fit for each line of extracellular vesicles. (b) Simplified diagram of the nanoaperture optical trapping setup with a scanning electron microscope image of a DNH and an illustration of an exosome being trapped. (c) Change in transmission percentage through the DNH for each line of extracellular vesicles.

Figure 2. (a) Conceptual overview of the CNN training and classification of the optical signal. (b) Confusion matrix of the CNN model with a training/validation/testing split of 50%, 30%, and 20%.

Table 1. Transmission count, average, and variances for each EV.

showing a similar size/intensity trend, but no quantitative values can be obtained at this point. The size range of EVs in the solution was obtained from NTA and is shown in figure S8. A 1D convolutional neural network (CNN) was used to classify the trapping signals. The full data sets and code are published; see data availability below. CNNs have precedent in classifying optical tweezers signals, but are largely used in image-based deep learning [29][66][67][68][69]. We applied a simple 1D CNN architecture to classify the transmitted optical signal of EVs. Typically, deep learning models for computer vision have achieved tremendous success using feature learning. Unlike these models, CNNs do not require features to be identified a priori to model training: feature learning and classification are jointly learned as a result of multiple layers in the model (pooling, convolution, normalization, etc). 1D CNNs are useful for taking short-term temporal features and identifying long-term patterns. Initial layers of CNNs identify simple features such as peaks and dips in time series data or edges in images using a Gabor-like filter. Further processing in the CNN layers identifies increasingly complex patterns until the final layer, where discriminative features can be identified. Compared to a deep CNN, where the number of training parameters is given by w × h × c, where w and h are the width and height of the filter and c is the depth of the convolutional filter (including time dimensions), the 1D CNN has only c training parameters, allowing for better results when training with smaller data sets. Past studies have shown 1D CNNs to have better performance for applications with limited labeled data and high signal variance.

Table 3. CNN training results for each EV.