Machine learning reveals complex behaviours in optically trapped particles

Since their invention in the 1980s [1], optical tweezers have found a wide range of applications, from biophotonics and mechanobiology to microscopy and optomechanics [2, 3, 4, 5]. Simulations of the motion of microscopic particles held by optical tweezers are often required to explore complex phenomena and to interpret experimental data [6, 7, 8, 9]. For the sake of computational efficiency, these simulations usually model the optical tweezers as an harmonic potential [10, 11]. However, more physically-accurate optical-scattering models [12, 13, 14, 15] are required to accurately model more onerous systems; this is especially true for optical traps generated with complex fields [16, 17, 18, 19]. Although accurate, these models tend to be prohibitively slow for problems with more than one or two degrees of freedom (DoF) [20], which has limited their broad adoption. Here, we demonstrate that machine learning permits one to combine the speed of the harmonic model with the accuracy of optical-scattering models. Specifically, we show that a neural network can be trained to rapidly and accurately predict the optical forces acting on a microscopic particle. We demonstrate the utility of this approach on two phenomena that are prohibitively slow to accurately simulate otherwise: the escape dynamics of swelling microparticles in an optical trap, and the rotation rates of particles in a superposition of beams with opposite orbital angular momenta. Thanks to its high speed and accuracy, this method can greatly enhance the range of phenomena that can be efficiently simulated and studied.


Inputs
Hidden Layers Outputs phenomena [26,27], it is necessary to employ semi-analytical methods such as the T-matrix method to obtain a physically accurate simulation [12,14,13,15]. However, these semi-analytical methods are prohibitively slow for running extensive dynamics simulations [20].
An alternative approach is to use local interpolation, which involves sampling the force distribution at a number of discrete points on either a grid (structured interpolation) or at random locations (unstructured interpolation) and using an interpolation function (e.g., a linear or cubic polynomial) to estimate intermediate values [28]. Unlike Brownian dynamics simulations, which require sequential force calculations, the force values required for interpolation can be calculated in parallel, significantly reducing run-time. While interpolation works well for low-dimensional problems (1 to 3 DoF) [28], it is difficult to implement efficiently for higher-dimensional problems because of runaway requirements on the memory storage and the lack of ready algorithms implemented in the major software packages.
In this Letter, we demonstrate an alternative method based on machine learning. We show that a neural network (NN) can be trained to calculate optical forces with greater accuracy than phenomenological methods, in significantly less time than exact methods, and with less memory requirements than interpolation methods. To demonstrate the power of this approach, we employ it to study two phenomena that would be prohibitively slow to accurately simulate otherwise: the escape dynamics of swelling micro-particles in an optical trap, and the hopping rates of particles in a superposition of Laguerre-Gaussian beams.
While standard methods require explicit formulas to determine the optical forces starting from the physical parameters of the particle and trapping beam, in this work we employ machine-learning models that are trained to automatically determine the optimal rules to calculate the optical forces through a large series of known samples, i.e., pairs of physical parameters and corresponding optical forces (which can be obtained either from an experiment or an accurate numerical simulation). Specifically, we employ NNs because they have been one of the most successful tools for machine learning in recent years [29,30]. NNs have outperformed standard approaches to track particles the T-matrix (TM) and similarly fast as the unstructured interpolation (US), while the structured interpolation (S) and the harmonic approximation (H) are even faster. d NN with 5 DoF (position x, y, z, particle radius R, refractive index n): e its MAE and f its speed are still comparable to those of the 3 DoF NN, while the other methods cannot be easily employed as discussed in the text; for convenience of comparison, the gray lines in e and f report the results in b and c, respectively. Error bars in b show the variation between 10 NNs trained on the same training data (see Methods for details). [31,32], to enhance microscopy [33], to predict optical scattering [34], and to calculate hydrodynamic forces [35].
Here, we employ dense fully-connected NNs, an example of which is illustrated in Fig. 1a. These NNs consist of a series of fully-connected layers of neurons, where the output of each neuron is a nonlinear function of the weighted sum of the neuron's inputs. The weights are iteratively adjusted (trained) until the NN learns to associate the correct optical force to each set of physical parameters. To train the networks we used Keras with a Tensorflow backend [36] (see Methods).
To evaluate the performance of this approach, we first explore the simple scenario of the optical forces acting on a spherical particle in an optical trap generated by a Gaussian optical beam. The NN, shown in Fig. 1a, consists of 3 layers of 256 neurons connected to three input and three output neurons. The three inputs are the threedimensional particle position (x, y, z) and three outputs are the components of the optical force acting on the particle (F NN x , F NN y , F NN z ). We train this NN using 10 6 samples randomly distributed around the beam focus, obtained from exact T-matrix simulations [13,37]. Fig. 1b compares the NN predicted forces with the exact Tmatrix forces for 10 6 previously unseen samples, demonstrating very good agreement between the NN predictions and the T-matrix method. Figs. 1c and 1d further demonstrate the excellent agreement between the NN (orange symbols) and the exact T-matrix method (gray lines) both along the transverse x-direction (Fig. 1c) and along the axial z-direction (Fig. 1d), which extends well beyond the linear-force regime around the focal point. Fig. 2 compares more quantitatively the NN and alternative standard methods in terms of both accuracy and speed as a function of their memory footprint. We start by considering the 3 DoF NN discussed in Fig. 1. By varying the number of neurons in the 3 intermediate layers, we directly control the NN complexity and correspondingly the required memory. This allows for a straightforward comparison between the NN memory footprint and the memory needed to store the interpolation data (see Methods). The resulting mean absolute error (MAE) as a function of the memory footprint is shown by the orange line in Fig. 2b (NN-3). The NN accuracy is strongly dependent on the training sample distribution: for optimal training performance, the samples should be clustered around regions where the force varies rapidly. Since in a Gaussian-beam optical trap the forces are largest and vary the most around the beam focus, a convenient choice for the sample distribution is Gaussian-distributed samples around the focus (see Methods). Similar considerations apply for the choice of points for unstructured interpolation (see Methods). For all memory footprints, the NN greatly outperforms both structured interpolation (S, gray squares) and unstructured interpolation (US, gray circles). This improved performance is due to the fact that, while unstructured interpolation requires storing all training samples indefinitely, NN only needs to see the training points during training and is able to encode the same information in a network using significantly less memory. For reference, we show also the MAE of the harmonic model (H, dashed line). We can observe that, while for sparse grids (low memory footprint) both interpolation methods perform worse than the harmonic model, the NN already outperforms it even for the smaller memory footprints. . The exact T-matrix method is the slowest (TM, gray dotted line, ≈ 10 000 s to calculate 10 6 samples), while the harmonic approximation is the fastest (H, gray dashed line, ≈ 0.01 s to calculate 10 6 samples). For both these methods the memory footprint is fixed. The NN (NN-3, orange line) and unstructured interpolation (US, gray circles) perform similarly (≈ 10 s to calculate 10 6 samples) and about three orders of magnitude faster than the T-matrix method almost independently of the memory footprint. Structured interpolation (S, gray squares) is faster than both these methods, but at the expense of a much lower accuracy (Fig. 2b).
The results in Figs. 2b and 2c show that the NN is better in terms of accuracy and similar in terms of speed to unstructured interpolation for systems with ≤ 3 DoF. The NN gains a real edge when increasing the number of DoF. For example, we can consider a problem with 5 DoF, corresponding to the particle position, x, y and z, its radius R, and its refractive index n. In such case, structured interpolation becomes unfeasible because of memory constraints (a 5-dimensional grid with comparable resolution reaches quickly in the Terabytes). Similar memory-management problems emerge when considering unstructured interpolation algorithms. A NN (Fig. 2d) can still be trained with enough samples to learn the 5-dimensional dependence of the force field achieving good accuracy (Fig. 2e) at a similar speed as in the 3-DoF case (Fig. 2f).
We now turn our attention towards scenarios that are almost impossible to accurately simulate without the NN approach. First, in Fig. 3, we consider a particle whose size and refractive index gradually change while held in an optical trap. This scenario is relevant to problems including modelling hydrogel particles swelling over time in optical tweezers or the study of cells undergoing osmotic stresses. More generally, this is an example of a problem where particles deform as they move through optical fields, such as the study of water-glycerol droplets [38] and of the deformation and growth of cells or microorganisms [39]. Using the 5-DoF NN shown in Fig. 3a (the same as that we employed in Figs. 3d-f), we simulate the escape trajectories of a swelling particle held in an optical trap in the presence of a fluid flow (see Methods). As schematically shown in Fig. 3b, a swelling particle, whose radius increases and whose refractive index decreases, is gradually pulled out of the trap when the restoring optical force decreases. This scenario is difficult to explore experimentally or computationally due to the large number of repetitions required to determine the statistics of the escape trajectories. For a rapid swelling (Figs. 3c-e), the particle escapes the trap almost immediately after the flow is turned on. For a slow swelling instead (Figs. 3f-h), the particle initially remains trapped at the edge of the beam before eventually escaping. Without the NN approach, it is practically impossible to collect enough statistics to clearly see how the escape trajectories change. For example, using exact T-matrix calculations it takes us ≈ 50 s to simulate just 5 trajectories (Figs. 3d and 3g), the same amount of time during which we can simulate 10 4 trajectories with the NN (Figs. 3e and 3h). The latter amount of statistics are essential, for example, to determine the final particle distribution with a high degree of accuracy.
The second scenario we consider involves the mixing of two beams with opposite orbital angular momentum (OAM) [40], which generate interesting interference patterns that can be used, for example, to design ratchet-like micromotors [41,42]. Fig. 4a shows a schematic of the experiment. A particle is placed in a mixture of two Laguerre-Gaussian beams of order ±5 mixed into a single beam with a fixed power, so that the resulting beam is α LG 0,+5 + (1 − α) LG 0,−5 , where α ∈ [0, 1]. Different values of α lead to different orbital rates ω for the optically trapped particle. Fig. 4b compares the results of an experiment (black symbols), where we trap a particle pushing it against the microscope slide using the laser beam (see details in Methods), with those we obtain employing a 4-DoF NN (shown in the inset of Fig. 4b) to extensively simulate this system (orange line). We obtain excellent agreement with the experimental results. This strong agreement is further illustrated by the time-averaged experimental (Fig. 4c : Swelling particle escape from an optical trap simulated with a 5-DoF neural network. a 5-DoF NN with inputs for particle position, x, y and z, radius R and refractive index n. b A swelling particle is trapped in the presence of a flow; as the particles swells, its radius R(t) increases and its refractive index n(t) decreases, so that the restoring optical force decreases and, eventually, the particle escapes from the trap. The beam is propagating downwards and the trap centre is marked by the orange cross. c Time dependence of the particle properties R(t) (solid line) and n(t) (dashed line) for a case in which the particle swells quickly. LG 0,-5 LG 0,5 0.5 0.7 1 0.3 0 Figure 4: Dynamics of a particle in a superposition of Laguerre-Gaussian beams. a Schematic of an experiment with a particle held by a superposition of two Laguerre-Gaussian beams of order ±5 with opposite orbital angular momentum, weighted by the parameter α so that the total beam is α LG 0,+5 + (1 − α) LG 0,−5 . b Rotation rates obtained from experiments (black symbols) and from neural-network-based simulations (orange line). The error bars represent standard errors. Inset: schematic of the employed 4-DoF NN. c 20-s experimental trajectory of a particle for various α, leading to a confining potential (α = 0.5), a washboard potential (α = 0.7), and an inclined potential (α = 1). d Corresponding 20-s simulated trajectory for a single particle. e Average of 100 simulated trajectories clearly showing the steady state behaviour in the various cases. The scale bars in c-e represent 1 µm. and simulated (Fig. 4d) trajectories, which represent ≈ 20 s of a single particle's trajectory, revealing three distinct behaviours: When ω ≈ 0 (α = 0.5), we have a series of confining potential wells resulting from the even mixture of two beams with opposite OAM [42]. When ω is significantly larger than 0 (e.g., α = 1), we have a smooth inclined potential around the beam, where the particle slides at an approximately constant rate [40,24]. Finally, in intermediate regions (e.g., α = 0.7), we obtain a washboard potential, where the particle is metastably trapped in local minima, but on average moves around the beam [43]. The NN allows us to rapidly explore the effect of different parameters (including height of the beam relative to the microscope slide, beam power, and viscous drag) on the observed rotation rates and trajectories. Further still, the NN allows us to simulate significantly more accurate probability distributions: For example, Fig. 4e shows the average of 100 simulated trajectories, clearly displaying the steady-state behaviour of this system.
Thus far, we have focused on how NNs can empower fast and accurate simulations for designing, analysing and modelling experiments. Another key advantage of the NN approach is its small memory footprint, which allows easy distribution of pre-trained NNs to calculate optical forces. In this way, multiple users can take advantage of the initial training cost and perform sophisticated numerical simulations on readily available hardware. This enables more collaborative workflows and the possibility of integrating numerically accurate simulations into interactive online demonstrations or teaching material.
In conclusion, we have shown NNs to be a valuable tool for simulating optical forces enabling longer, more accurate, and more memory-efficient simulations. Compared to interpolation and other methods, NNs are easier to implement for problems with many DoF and they use significantly less memory while still performing with similar evaluation times to the most efficient interpolation techniques. In particular, NNs can be used to supplement experiments, including to design trapping configurations or to analyse experimental results. While here we considered only spherical particles, our method can be extended to particles with other shapes by adding rotations as inputs and torques as outputs. Further investigation into different network architectures and training could further improve accuracy and decrease the number of points needed for training. More broadly, the method described here is general enough to be applicable to a wide range of fields including simulations of optomechanics, optoelectronics or acoustics. Although we focus on demonstrating this method for simulating optically trapped particles, we believe this technique will be of interest to the wider optics and photonics community and can be used to model other complex optical processes.

Training of Neural Networks
NN are trained in Python using Keras (version 2.2.2) [36] with TensorFlow backend (version 1.5.0). An example Jupyter Notebook for the 3-DoF NN is provided in Supplementary Information. Training each NN consists of three main parts: loading the data, setting up the NN model, and training the NN on the training data.
Training data for each NN is loaded from a data file containing exemplary inputs and outputs. Values are read from the file, scaled to near unity, randomly shuffled and split into a validation and training set (with a 1/9 split). For our NNs, training data consists of simulation data generated in Matlab (R2018a) using the Optical Tweezers Toolbox (OTT, version 1.5) [13,37]. Importantly, training data could have been generated with another process, for example, by position and force measurement in an experiment.
Setting up the NN model involves specifying the shape and parameters for each NN layer. In this work we explored dense fully connected NNs with few intermediate layers. Details about the number of layers and neurons per layer can be found in subsequent sections. Intermediate layers use rectified linear unit (ReLU) activation functions. The final layer uses a linear activation function. Single precision values are used for NN weights for compatibility with available training hardware.
The NN is trained by optimising the NN weights to minimise a loss function. We use the Keras implementation of the Adam optimiser [36] with default parameters and mean squared error for the loss function. For training the NN, we use gradually increasing batch sizes. The training set is used for training the NN and the validation set is used to evaluate the accuracy of the model at the end of each training epoch. We train the NNs on NVIDIA Tesla V100 graphics cards.

Degrees of Freedom Neural Network
These are NN with 3 inputs, 3 layers of hidden neurons and 3 outputs. For the comparison presented in Figs. 2b and 2c, we train 7 different NNs with 4,8,16,32,64,128, and 256 neurons in each hidden layer. Fig. 1 uses 256 neurons for each hidden layer. Training data is simulated with OTT for a spherical particle (radius 818 nm, refractive index 1.5) in a circularly polarised Gaussian beam (numerical aperture 1.02, medium refractive index 1.3, wavelength in medium 818 nm). Training data consists of 10 6 data points randomly distributed according to a normal distribution with 3.19 µm standard deviation and centred on the beam focus. Error bars in Fig. 2b show average results of 10 NNs trained on the same data.

Mean Average Error
Mean average error (MAE) is calculated according to where F TM is the T-matrix optical force calculated with OTT, F NN is the NN estimate, and N samples is the number of samples in the validation data set. In Figs. 1b, 2b and 2e, the validation data set consists of 10 6 additional locations not used in the training of the NN.

Calculation of memory footprint
Memory footprint depends on the method. Optical tweezers harmonic models typically use between 1 and 9 values (i.e., between 8 and 72 bytes if using double precision values). Structured interpolation stores values on an evenly spaced grid (i.e., a matrix), so that the required memory is directly related to the data type and the number of values stored (i.e., an interpolation grid with 100 × 100 × 100 locations, each location containing 3 double precision values for F x , F y , and F z , has an approximate memory footprint of 24 × 10 6 bytes ≈ 24 MB). Unstructured interpolation requires storing the location and value at each grid point (i.e., for 10 6 unstructured double precision position and force values, we need to store 48 × 10 6 bytes ≈ 48 MB). For the 3 DoF NN, the number of parameters (NN weights and biases) scales as where N is the number of neurons per hidden layer. The memory footprint for the NN is approximately the number of parameters multiplied by the memory per parameter (4 bytes for single precision values). The 5 DoF NN has an additional 2N parameters corresponding to the weights connecting to the two additional inputs, so that the total is

Sampling distribution
For both unstructured interpolation and the NN approach, the choice of sampling locations affects the accuracy of the predicted forces. With infinite samples, interpolation will exactly reproduce the target distribution. With a suitable architecture and training, a NN is capable of emulating interpolation; consequently, the upper bound for the accuracy of a NN is related to the number and distribution of points used for training. For the NN in this work, points are distributed around locations where the particle is likely to be and where the forces vary most rapidly. For Gaussian beam NNs (3 DoF and 5 DoF), this corresponds to a Gaussian distribution centred at the beam focus. For the 4 DoF NN, this corresponds to a toroidal shaped distribution; the shape of the toroid was determined from preliminary simulations which indicated where the particle was likely to be found.

Calculation of evaluation time
To measure the evaluation time shown in Figs. 2c and 2f, we used a 3.4 GHz Intel Core i7-6700 personal computer with 16 GB RAM running implementations of each algorithm in Matlab. Matlab is chosen to allow simple comparison between performance of OTT (which is a Matlab package [13,37]) and the other methods. Structured and unstructured interpolation use Matlab's built-in implementations; and a harmonic model with 1 parameter per dimension is implemented. Evaluation locations matching those used for the MAE comparison are used. Methods are not explicitly parallelised. Additional performance may be achievable using specialised implementations or explicit parallelisation.

Degrees of Freedom Neural Network
NN with 5 inputs, 3 hidden layers and 3 outputs. For Fig. 2, number of neurons in hidden layers is varied, similar to the 3 DoF NN. For Fig. 3, each hidden layer has 256 neurons. Training data is simulated with OTT for a spherical particle in a circularly polarised Gaussian beam (numerical aperture 1.02, wavelength in medium 800 nm, medium refractive index 1.33). Training data consists of 10 7 data points with: refractive index and radius values randomly sampled from a uniform distribution with the range 1.33-2.00 and 0.1-1.0 µm respectively; and positions randomly sampled from a normal distribution centred at the focus and standard deviation of 5 µm.

Escape trajectory simulation
The particle refractive index and radius are given by n(t, σ) = n water + 0.1e −t 2 /2σ 2 and R(t, σ) = 0.5 − 0.2e −t 2 /2σ 2 µm respectively, where t is the simulation time, σ is the decay (growth) rate of the refractive index (size) of the particle, and n water is the refractive index of the medium. The particle is initially trapped by the beam. At 1 ms a flow is enabled causing the particle to move towards the edge of the trap and eventually escape.

Laguerre-Gaussian beam mixture experiment
Particles are confined in two dimensions using mixtures of LG beams. These beams carry orbital angular momentum which is transferred to the particle. Beams are created with a liquid crystal spatial light modulator (Meadowlark Optics, 512x512 HSP512L, high-speed SLM), illuminated by a 1064 nm fibre laser (YLR-10-1064-LP, 10 W, IPG Photonics). The hologram is imaged onto the back-aperture of a 1.2 numerical aperture water immersion objective (Olympus UPlanSApo 60×) and focussed onto the sample. The sample consists of 2 µm polystyrene spheres (refractive index ≈1.59) in water (refractive index ≈1.33) prepared between two microscope coverslips. Particles are axially confined by pushing them against the microscope coverslip with the optical trapping beam. A surfactant is mixed with the water to reduce the chance of particles sticking to the coverslip.

Degrees of Freedom Neural Network
NN with 4 inputs, 3 hidden layers with 128 neurons per layer, and 3 outputs. Training data is simulated for a spherical particle (radius 1 µm, refractive index of particle 1.59) held in different mixtures of LG beams (refractive index of medium 1.33, wavelength in medium 800 nm). In order to accurately model the experiment, training data is generated using beams modelled on the experimentally realised beams (using the paraxial point-matching code from OTT and an approximation for the phase and intensity of the beam at the objective back aperture, numerical aperture 1.2). The experimentally realised beams are approximately LG+/-5 beams except for variations in the intensity profile due to the incident illumination. 10 7 training points are generated. Due to the computational expense in calculating different beam mixtures, only 21 different alpha values are present in the data set, evenly spaced from 0 to 1. For each data point, alpha is randomly selected with a discrete uniform distribution, and position is uniformly randomly chosen from a toroidal volume with inner radius 0.5 µm and outer radius 2 µm and height 1 µm centred on the beam axis 0.5 µm from the focus. The toroid corresponds to locations where the particle is trapped in the LG ring and allows multiple axial positions to be explored in simulations.

Laguerre-Gaussian beam mixture simulation
The simulation is designed to model the experimental observations. The beam power is set to match the experiment rotation rate for a pure LG beam. Axial position is chosen to produce a ring with the same radius as the experimentally observed ring. Particle movement is constrained to prevent motion along the beam axis, emulating the effect of the microscope coverslip. Particle motion is modelled using a finite difference simulation including both the deterministic optical forces and random Brownian motions. The particle in the experiment was observed to occasionally stick to the slide; this is modelled with a frictional term introduced after including Brownian motion.

Funding
This research was funded by the Australian Government through the Australian Research Council's Discovery Projects funding scheme (project DP180101002). I.L. acknowledges support from the Australian Government RTP Scholarship.