Three-dimensional localization refinement and motion model parameter estimation for confined single particle tracking under low-light conditions

: Confined diffusion is an important model for describing the motion of biological macromolecules moving in the crowded, three-dimensional environment of the cell. In this work we build upon the technique known as sequential Monte Carlo - expectation maximization (SMC-EM) to simultaneously localize the particle and estimate the motion model parameters from single particle tracking data. We extend SMC-EM to handle the double-helix point spread function (DH-PSF) for encoding the three-dimensional position of the particle in the two-dimensional image plane of the camera. SMC-EM can handle a wide range of camera models and here we assume the data was acquired using a scientific CMOS (sCMOS) camera. The sensitivity and speed of these cameras make them well suited for SPT, though the pixel-dependent nature of the camera noise presents a challenge for analysis. We focus on the low signal setting and compare our method through simulation to more standard approaches that use the paradigm of localize-then-estimate. To localize the particle under the standard paradigm, we use both a Gaussian fit and a maximum likelihood estimator (MLE) that accounts for both the DH-PSF and the pixel-dependent noise of the camera. Model estimation is then carried out either by fitting the model to the mean squared displacement (MSD) curve, or through an optimal estimation approach. Our results indicate that in the low signal regime, the SMC-EM approach outperforms the other methods while at higher signal-to-background levels, SMC-EM and the MLE-based methods perform equally well and both are significantly better than fitting to the MSD. In addition our results indicate that at smaller confinement lengths where the nonlinearities dominate the motion model, the SMC-EM approach is superior to the alternative approaches.


Introduction
Introduced in the late 1980's [1,2], Single Particle Tracking (SPT) has become an important tool for studying biomolecular motion, particularly inside living cells, with the goal of determining the trajectory of the particle and the parameters that define its motion [3,4]. In recent years there has been growing interest in tracking particles moving in three dimensions using, for example, engineered Point Spread Functions (PSFs) such as the Double Helix PSF (DH-PSF) [5], tetrapod [6], and others [7]. These encode the 3D position of the particle by engineering the shape of the PSF of the instrument. For the DH-PSF, for example, the PSF takes the shape of two (approximately) Gaussian lobes where the lateral position of the particle is at the midpoint between the centers of the lobes and the axial position is given by the rotation of the lobes around this point. In this work we build upon our prior efforts for doing simultaneous localization and parameter estimation from segmented images in SPT data [8,9], focusing on the case of three dimensional confined diffusion and expanding upon our prior formulation to include the DH-PSF.
SPT analysis of a sequence of images typically begins with image segmentation where the raw images are processed to extract image sequences containing information about a single particle [10]. Under the standard paradigm, these sequences are then further analyzed using a two-step approach. In the first step, the initial localization provided by segmentation is refined using methods such as Gaussian Fitting (GF) [11,12] or Maximum Likelihood Estimation (MLE) [13,14]. The resulting trajectory is then analyzed to estimate model parameters, often using a Mean Square Displacement (MSD) analysis [15,16] or, at least for some models, with an MLE [17,18]. These methods are most effective when there is large signal and low background intensity in the images with a Signal-to-Noise Ratio (SNR) of at least five [10,19]. They struggle or even fail outright when the signal intensity gets low [20,21]. However, SPT experiments are often photon-impoverished and subject to significant background levels. As a result, developing methods that work well under these conditions is a key challenge. To maximize the signal level, experiments often use sensitive detectors such as Electron Multiplied Charge Coupled Device (EM-CCD) or scientific Complementary Metal-Oxide Semiconductor (sCMOS) cameras. sCMOS cameras in particular have become quite popular due to their (relatively) low cost, high resolution, frame rate, and quantum efficiency, and their large field of view [14,22]. These cameras, however, bring pixel-dependent noise characteristics that, together with the Poisson-process nature of photon generation, must be accounted for, particularly at low signal levels.
Even with experiments that yield data with high signal levels, confined diffusion is a challenging motion model to estimate as it is fundamentally nonlinear. It is, however, an important mode of motion for molecules moving in the crowded, three dimensional environment of the living cell and has been used, for example, to describe the dynamics of particles inside vesicular compartments [23], particle motion in membranes [24], the motion of T-cells [25], and other biomolecular dynamics (e.g. [26][27][28]). The standard tool for estimating confinement lengths and diffusion coefficients is fitting the MSD to a confined diffusion model [15,26]. While this is a popular technique, it has been shown to be sensitive to user-selected parameters and to noise in the data [17,29]. These limitations led to the development of MLE schemes for linear models such as diffusion, directed, and Ornstein-Uhlenbeck (OU) motion, but the nonlinear nature of confined diffusion makes direct MLE challenging. The parameters extracted from MLE based on an OU model have been used to infer the confinement lengths [18] but this approach does not directly model the confinement.
We previously introduced a methodology for simultaneous localization and parameter estimation that relies on a combination of Monte Carlo filtering to handle nonlinear observation and motion models and expectation maximization for optimal (ML) estimation of model parameters [8,9,30,31]. This approach, known as Sequential Monte Carlo-Expectation Maximization (SMC-EM) takes the sequence of segmented camera images as inputs and jointly estimates the trajectory and the motion model parameters. It has been applied to wide-field images and to time series data from confocal-based SPT [32], as well as to estimate parameters with time-varying motion models [33]. In this work, we build upon that foundation and extend the algorithm to include the DH-PSF imaging modality. Our approach also accounts for the pixel-dependent readout noise that is characteristic of sCMOS cameras and which can have a deleterious effect on localization and estimation algorithms if ignored. While powerful and flexible, this approach does in general suffer from high computational complexity arising from the Monte Carlo filters and the iterative technique of EM. This is even more of an issue when handling confined diffusion as the one step transition function defining that model involves an infinite sum. In this work, we apply an efficient Monte Carlo filter known as the Gaussian Particle Filter (GPF) [34], combine it with a Backward Simulation Particle Smoother (BSPS), and utilize an accurate approximation to the confined diffusion transition density. These modifications, introduced in [30], significantly reduce the computational time without reducing the localization and estimation performance.
To validate the algorithm against a known ground truth, we carry out simulation studies, comparing the performance of SMC-EM against alternative approaches based on the standard paradigm of localize-then-estimate. We generate trajectories by simulating a particle diffusing in a box-like environment with images acquired using an sCMOS camera based on a DH-PSF. For the localization step, we consider two methods. The first is a GF where the intensity data in each segmented image is fit to a pair of Gaussians, one for each lobe in the DH-PSF. The second localization algorithm is a modification of an MLE developed in [14] for localizing a particle based on sCMOS data; here we extend that algorithm to the DH-PSF. From these localizations, we determine the model parameters using either a fit to the MSD or an optimization approach based on MLE methods. While the MLE for the confinement length in each axis is straightforward, the nonlinearities of the confined motion model do not admit a closed-form solution for the diffusion coefficient. Here we make a simple approximation and assume the particle is freely diffusing and then apply the MLE developed in [17] to determine the diffusion coefficient.
The primary contributions of this work are (a) extending the SMC-EM technique to include the DH-PSF for three dimensional motion, (b) demonstrating the ability to do accurate localization and parameter estimation of confined motion at low signal and low SBR levels, and (c) providing a quantitative comparison to the standard method of GF-MSD and to other approaches based on optimal estimation across a range of (very low) SBR levels and confinement lengths.

Confined motion
In this work, we consider a particle of interest diffusing in a rectangular "corral-like" environment. That is, the motion in each axis is limited to the interval [−L/2, L/2]. For simplicity we take L to have the same value for each axis. The one-step transition density in a single axis is given by [26] p where the unknown parameters D x and L x denote the diffusion coefficient and confinement length, respectively, and ∆t denotes the time interval between two consecutive image frames. The particle motion in each dimension is independent and the joint transition density is thus given by the product of three copies of Eq. (1), one for each of x, y, and z. The density in Eq. (1) involves an infinite series and one typically takes the value of N p "large enough" to ensure an accurate representation of the transition density. In prior work we developed a quantitative approach for selecting N p for a desired approximation error and showed that while the choice depends on both L x and D x , a value of N p ∼ 20 is typically sufficient; see [30] for details. Realizations of the process defined by Eq. (1), needed for the SMC-EM algorithm as well as to generate simulated data can be obtained using a simple acceptance-rejection algorithm [35].

DH-PSF
The DH-PSF, illustrated in Fig. 1, uses a phase plate in the output path of the microscope to generate a PSF such that the particle is at the center between two lobes that rotate around each other as the particle moves along the axial direction [7]. While the true shape of the DH-PSF is non-trivial, it can be well-approximated by a pair of Gaussian lobes [36], where G denotes the peak signal intensity; (x i , y i ) are the centers of the lobes, given by where r is a constant and (x p , y p ) is the location of the particle. The angle of the lobes, θ, is a monotonic function of the axial position with a specific relationship that depends on the instrument. This relationship is typically obtained by generating a calibration curve by scanning a single feature along the axial direction and measuring the resulting PSF [7]. In this work, for simplicity we take a simple linear model where k is a constant. Fig. 1. The DH-PSF generates two lobes that rotate in the x − y plane as the particle moves in the z direction. (left) A sequence of simulated images of a particle at different z locations.
(right) an equi-intensity profile of the DH-PSF in (x, y, z), demonstrating the choice of the "double-helix" name.

sCMOS camera model
Data in SPT experiments are typically given by a series of camera frames that are pre-processed into segmented images of P pixels arranged into a √ P × √ P square array. The pixel size is ∆x by ∆y with the actual dimensions determined both by the physical size of the elements on the camera and the magnification of the optical system. Photon generation is described by a shot noise process and at time step t, the expected photon intensity measured for the p th pixel is where (x t , y t ) denotes the position of the fluorescent particle, the integration bounds are over the given pixel, and PSF(·, ·) is the point spread function of the instrument. In addition to the signal, there is always a background intensity rate arising from out-of-focus fluorescence and autofluorescence in the sample. While the background depends on the local environment, it is typically modeled as a uniform rate N bgd across the small range of the P pixels. Combining these signals, the measured photon counts in the p th pixel at time t is given by statistics of ϵ are independent of the particular pixel and the readout noise can often be neglected. For sCMOS cameras, the readout noise is pixel-dependent and incorporating this dependency into the model is important both to generate realistic simulations and to produce accurate estimation from the data. Following [14] we model the readout noise as where σ p,t is the standard deviation of the readout noise, and Var p,t and g p,t are called the variance and gain of pixel p at time t, respectively. Combining Eq. (6) with Eq. (7) yields the probability density function (PDF) of the measured photon counts in pixel p at time t, A typical frame generated from this model is shown in Fig. 2. The image was generated using Eq. (8) with the rate λ given by the DH-PSF with G = 30, a background rate of N bgd = 10, and variance and gain maps for the pixels as shown in the figure.

Localization and estimation using SMC-EM
SMC-EM, shown in Fig. 3, is a framework for simultaneous localization and parameter estimation that uses the well-known EM algorithm to find a (local) MLE of an unknown parameter. In this section we give a brief overview of the approach. Consider, then, the problem of identifying an unknown vector parameter θ ∈ R n θ for the nonlinear state space model where x ∈ R n x is the state, y ∈ R n y is the observation, w t is the process noise with a given distribution (which may depend on θ), and v t is the observation noise with a given distribution (that may also depend on the unknown parameter). (Note that in this section only, we are using the notation x t and y t to refer to the state and observations of the general model in Eq. (9), not the position of a tracked particle in the plane.) Our goal is to find an MLE of the parameter θ from the data Y N ≜ {y 1 , . . . , y N }, given bŷ where p θ (·) is the joint PDF over the observations. This optimization can only be solved in closed form in certain simple cases as p θ (Y N ) is typically intractable. The EM algorithm approaches this problem by defining a hidden (or latent) variable; in the context of SPT this hidden variable is exactly the trajectory of the particle. EM moves towards the maximum of the log likelihood through iterative optimization of a function Q given by whereθ (e) is current estimate of the parameter and E is the expecation operator. The calculation of Q (︂ θ,θ (e) )︂ is called the Expectation (E)-step at the e th iteration. It has been shown that any )︂ also increases the original likelihood [37]. Thus, the E-step is followed by a Maximization (M)-step to produce the next estimate, Following [38], we decompose (11) into where To determine the distributions needed in (14), we use a slightly modified version of the Gaussian Particle Filter (GPF) that we refer to as the simplified GPF (sGPF), combined with the Backward Simulation Particle Smoother (BSPS) (see Appendix A for details). Under SMC-EM, the Q function is then approximated by where K denotes the number of simulated trajectories used in the BSPS andx k t |N denotes the k th simulated trajectory at time t. This (approximation) to the Q function is then maximized to find the next parameter estimate before proceeding back to the E step. Details on the M-step in the context of SPT are given in Appendix B. This process is repeated until a termination criterion is met. Based on experience, in this work we terminate after a finite number of iterations (taken to be 15).
With a dataset consisting of N images, using M Monte Carlo samples in the sGPF and K = M simulations in the BSPS, the time complexity of this version of SMC-EM is O(NM 2 ); while this is equivalent to previous work in [8] based on a Sequential Importance Resampling (SIR) filter and particle smoother, the use of BSPS provides the ability to parallelize computations that is not afforded by a standard particle smoother and in particular reduces the time complexity of carrying out the E step of the EM algorithm. The significant reduction in computational complexity is especially important when considering motion in three dimensions as it allows for a much larger number of Monte Carlo samples for the same run-time. A detailed run-time comparison between the previous and current versions of SMC-EM can be found in the Supplemental materials.

Demonstration and results
In this section we use simulations to demonstrate SMC-EM and compare its performance against algorithms based on the standard localize-then-estimate paradigm. We assume segmentation of the raw images to a sequence of images containing the DH-PSF of the single particle has been performed. This is, of course, a non-trivial task, especially at low signal levels. Our interest here, however, is on the relative performance of the estimators and by beginning with the segmented image sequences we can avoid confounding effects of segmentation.
The performance of SMC-EM depends on the number of Monte Carlo samples; we denote this with a superscript so that, for example, SMC 100 -EM uses 100 samples. For the comparison, we use two approaches for stand-alone localization. The first is based on a GF to each lobe of the DH-PSF in the image to determine their centers; the particle location is then determined from Eqs. (3) and (4). The second is MLE sCMOS , developed in [14] for estimating two-dimensional particle locations from sCMOS camera data, and modified here to include the DH-PSF for 3D localization. Details of these localization methods can be found in Appendix C. Once the particle trajectory has been determined, we estimate model parameters using either the MSD or a fast, MLE-like scheme. For the MSD, we perform a nonlinear fit to the MSD based on the confined diffusion model in [26]. For the MLE-like scheme, we use the true MLE for the confinement length; as shown in [8] this estimator is simply given by the range in the estimated trajectory in each axis. However, the nonlinear nature of the motion model precludes a simple solution of the MLE of the diffusion coefficient. Rather than utilizing a numerical solver, we apply a computationally efficient approximation to MLE for the diffusion coefficient assuming a free diffusion model as developed in [17,29]. We denote these combinations of algorithms using localization-estimation so that, for example, GF-MLE refers to localization using a Gaussian fit and parameter estimation using the MLE-like scheme, while MLE sCMOS -MSD refers to localization using the MLE sCMOS algorithm and parameter estimation using a fit to the MSD.
Simulations were made of a particle diffusing in 3D, confined to a cube centered at the origin with dimensions of 500 nm. Each simulation consisted of N = 100 frames at an imaging rate of 10 frames/s (for a period of ∆t = 100 ms). In practice, cameras accumulate photons over an integration period and the resulting motion blur arising from particle motion during integration is an important factor in performance. To replicate this effect, trajectories of length N × N sub were generated where N sub represents a sub-sampling factor. An image is generated at each of the subsample steps in the shutter period δ t . These are then averaged together to produce the final image for that imaging period to incorporate the effect of motion blur. The ground truth position for that image is taken to be the average of the particle position over the shutter period. The parameter settings for all simulations are summarized in Table 1. To generate statistics on algorithm performance, 50 datasets were simulated for each experimental setting. To investigate the behavior of the algorithms at low signal levels, we varied the peak intensity G across the range of 10-50 counts, holding the background rate fixed. The SNR of the images can be defined as [39,40] where F n is the excess noise factor (F n = 1 for sCMOS cameras) and the other terms are defined in Table 1). Because we directly set the background and peak intensity level, we report results as a function of Signal-to-Background Ratio (SBR), defined by From Eqs. (16) and (17), the corresponding SNR and SBR levels considered in the simulations are shown in Table 2. Typical images generated by the DH-PSF of a single particle under these SBR conditions are shown in Fig. 4.

Performance across different signal-to-background ratio (SBR)
We measure localization performance using the Root Mean Square Error (RMSE) over the entire trajectory as compared to the ground truth. Due to the motion blur effect, the notion of "ground truth" must be defined; here we use the average of the positions of the true trajectory over the integration period. The results over these simulations are shown in Fig. 5 as a function of SBR with GF shown in black/gray, MLE sCMOS in red, SMC 100 -EM in green, and SMC 1000 -EM in purple. The mean for each algorithm is shown as a dashed line, the median as a dotted line and the shadowed region shows the 10%-90% quantiles of the estimates. The results indicate several interesting features. First, as expected, increasing the number of Monte Carlo (MC) samples used in SMC-EM improves localization performance. Second, the algorithms differ primarily at the lowest SBR levels, performing similarly after an SBR of 4. At lower signal levels, the GF shows the lowest accuracy and, in the limiting case of SBR=1 is essentially equivalent to taking the center of the segmented image as the particle location. Both MLE sCMOS and SMC-EM, however, yield good localization even at the lowest SBR levels, with SMC-EM showing the best results. An estimation result on a typical trajectory (with SBR=1) is shown in Fig. 6. All of the algorithms follow the general trend of the true trajectory. GF, however, yields multiple large outliers, particularly in the z-direction, driving its larger RMSE. The MLE sCMOS is more variable than the SMC-EM methods which use data from the entire sequence of images and the motion model when inferring the particle location. Because MLE sCMOS consistently outperforms the GF approach, we do not consider the GF method further.
The results on parameter estimation using the different schemes are shown in Fig. 7 for the diffusion coefficient and in Fig. 8 for the confinement length. The results from MLE sCMOS -MSD are shown in light blue, those from MLE sCMOS -MLE in red, from SMC 100 -EM in green, and from SMC 1000 in purple. As before the mean for each algorithm is shown as a dashed line, the median as a dotted line, and the shadowed region shows the 10%-90% quantiles of the parameter estimate. These results clearly show that a fit to the MSD yields the worst performance in estimating the model parameters with both a larger bias and a much larger range of estimates. In general, the SMC-EM methods and MLE sCMOS -MLE have very similar performance for estimating the diffusion coefficient, though the SMC-EM methods have a smaller spread and bias. The exception is in the z axis where the SMC-EM methods exhibit a larger positive bias than MLE sCMOS -MLE at an SBR of one but otherwise shows the same trend. At SBRs of two and above, this bias remains consistent and, while small, is non-zero. There is some evidence in our prior work that the primary driver of this bias is the relatively short length of the trajectories; as the datalength is increased, this bias decreases [30].  The confinement length results show that at SBRs of 4 or above, the SMC-EM and MLE sCMOS -MLE approaches are quite similar. As the SBR decreases, however, the MLE sCMOS -MLE overestimates the length while the EM approaches remain accurate, though with an increased spread in the estimates at lower SBR.

Performance across different confinement lengths
It is reasonable to expect that estimation performance will depend on the confinement length. If L is very large, the particle will not interact with the boundaries very often and you may need very long runs to get an accurate estimate of the confinement. However, because the motion essentially becomes free diffusion, estimating the diffusion parameter is likely to be easier. Conversely, with small values of L, the nonlinearities in the motion model due to the confinement will be dominant. Because the particle is very likely to interact with the boundaries, even in short trajectories, estimation of L may be easier while the estimation of D may suffer since the motion no longer looks like diffusion. This is particularly likely in the MLE sCMOS -MLE algorithm since the diffusion estimation assumes a free diffusion model.
To explore this effect, we performed simulations for confinement lengths ranging from 0.1 µm-0.5 µm. Because MLE sCMOS outperforms GF and MLE outperforms MSD, we compared SMC-EM only to MLE sCMOS -MLE. Based on the results in Sec. 4.1, we expect the algorithms to differ at the lowest SBR but to perform similarly at higher SBR. We therefore show results here for SBR=1 and SBR = 5. Results for the range of SBRs between these extremes can be found in the Supplemental Materials. We also found that confinement length had little effect on localization and focus here on parameter estimation. Results for localization performance as a function of confinement length can be found in the Supplemental Materials.
The estimation results for SBR=1 are shown in Fig. 9. In this figure, results from MLE sCMOS -MLE are shown in red, from SMC 100 -EM in green, and from SMC 1000 -EM in purple. The mean for each algorithm is shown as a dashed line, the median as a dotted line, and the shadowed region shows the 10%-90% quantiles. The true parameter values are shown as a solid black line.
The results indicate that SMC-EM accurately captures the confinement length across the entire range of L considered. MLE sCMOS -MLE has the correct trend but shows a bias that decreases as the confinement length increases. Because the MLE for the confinement length is given by the range of the estimated trajectory, this likely reflects a larger propensity for occasional outliers in localization. Because the SMC methods utilize the motion model in localization as well as in parameter estimation, such outliers are much less likely. Similarly, the SMC-EM methods are more reliable for diffusion coefficient estimation at smaller confinement lengths. At the smallest confinement lengths, the particle spends most of its time interacting with the boundary of the domain and has limited mobility. As a result the diffusion coefficient is significantly underreported. The estimate improves with increasing L, with the SMC-EM methods consistently outperforming MLE sCMOS -MLE.
Estimation results for SBR=5 are shown in Fig. 10. Because localization is more accurate at the higher SBR, the estimation of confinement length using MLE sCMOS -MLE is more accurate, matching the performance of SMC-EM. At the smallest confinement lengths, however, MLE sCMOS -MLE still underestimates the diffusion parameter, performing substantially worse than the SMC-EM approach.
An alternative metric of performance in parameter estimation is the success rate defined as the percentage of trials with parameter estimates within 25% of the true value [29]. While this is a coarse metric, it provides a simple, concise means of comparing performance across a wide range of conditions. Heat maps showing the success rate of the algorithms across all values of SBR and confinement length we considered are shown in Fig. 11 for the confinement length and in Fig. 12 for the diffusion coefficient. In each figure, the columns show results in the {x, y, z} directions, the first row is for MLE sCMOS -MLE, the second for SMC 100 -EM, and the third row for SMC 1000 -EM. The general trend for the confinement length is that both MLE sCMOS -MLE and SMC-EM accurately estimate this parameter at higher SBR and larger L. As either SBR or confinement length is reduced, the SMC-EM methods outperform MLE sCMOS -MLE. Additionally, as expected the larger number of MC samples in SMC 1000 -EM improves performance, at the cost of additional computation time (see the Supplemental Materials for runtime details). This general trend is repeated in the diffusion coefficient estimation, with two notable differences. First, the heat maps indicate that at the smallest confinement length and lowest SBR, the SMC-EM methods show a large increase in success rate for the axial diffusion coefficient. Referring back to Fig. 9 shows that this is driven by a very large increase in variability of the SMC-EM estimates under these conditions rather than a true improvement in performance. Similarly, at the smallest  confinement lengths, SMC 100 -EM has a higher success rate than SMC 1000 -EM. This again is  due to the increased variability in the diffusion coefficient estimates, driven now by the smaller number of MC samples in the filters.

Discussion and conclusion
The conditions considered in this work, namely confined diffusion under low signal levels, are challenging for SPT estimation but are important from an applications point of view. Joint localization and estimation using SMC-EM techniques take advantage of all the data as well as motion model that connects data points. While SMC-EM has larger computational complexity than approaches following the localize-then-estimate paradigm, these simulation results indicate that the approach provides a significant benefit in localization and parameter estimation at low light levels and when the nonlinearities in the motion model that define the confinement dominate the particle behavior. For concreteness, we considered specific values of these parameters in the simulations but it is intuitively clear that the nondimensional ratio √ 2D∆t L defines the impact of the confinement on the trajectory in a given axis. For large values of L, small values of D, or fast imaging (and thus small ∆t), the motion will look very similar to free diffusion and, conversely, at small values of L, large values of D, or slow imaging, the confinement will dominate the dynamics. At higher signal levels, and when diffusion dominates, the standard paradigm works well. However, even under these settings it is important to note that the fit to the MSD, while the simplest approach to implement, has significantly worse performance than methods based on MLE.
While this paper focused on confined diffusion and observations using the DH-PSF, it is important to note that the SMC-EM framework is easily applied to a broad range of motion, PSF, and camera models, through appropriate modification to the underlying distributions used in the algorithm, and has been shown to be an effective approach, particularly under the low light conditions [9] and for models with time-varying parameters [33].

A. Filtering and smoothing algorithms
The Gaussian Particle Filter (GPF) [34] approximates the filtered distribution as a Gaussian (though extensions to more general distributions through propagating higher moments are straightforward). As with all such filters, the GPF consists of a measurement update step and a time update step. In the measurement update step, particles are drawn from an importance sampling function, π(x t |Y t ). The choice of this function is informed by the specific problem (see [42][43][44] for details) but often this is selected to be the prediction distribution (a normal distribution with a mean and covariance determined in the time update step). These samples are weighted based on the measurement distribution and their sample mean and covariance calculated to produce the filtered distribution. In the time update step, samples are drawn from this filtered distribution, propagated through the motion model, and the sample mean and covariance determined to produce the prediction distribution. In a simplification known as the alternate GPF (aGPF), the weighted samples from the filtered distribution are reused in the time update, avoiding the need to resample particles.
As shown in Alg. 1, in this work we make a small modification to the GPF to further reduce its computational cost. Specifically, in the time update step, we select all the particles to be at the mean of the filtered distribution. These are then propagated through the motion model as usual. We also follow a suggestion of [34] and use these propagated particles in the measurement update rather then drawing from the importance function. With these choices, we avoid the need for resampling from any distribution, simplify the implementation of the propagation through the motion model, and avoid the need to calculate explicitly the covariance matrix of either the filtered of the predicted distribution.
To initialize the filter at time t 0 , the first samples are drawn from an importance function defined as a uniform density over a cubic volume centered at (x o , y o , 0), where (x o , y o ) define the center of the first image, with a side length of twice the pixel size ∆x (see Table 1). This is a somewhat arbitrary choice and can be easily modified if there is additional information available or if a larger uncertainty is needed at the initial time. We note that there are other variants of BSPS with even better computational complexity, such as BSPS with rejection sampling [45]. Such approaches, however, must compute upper bounds on the state transition density and we find the cost of doing so for the confined model considered here makes such methods more costly than simple BSPS. One of the benefits of BSPS, however, is that the simulated trajectories are independent to each other. Our implementation takes advantage of this to simplify the computational load by parallelizing computations when possible.

B. M-step for parameter estimation
Under the confined diffusion model, there are six parameters to estimate: the confinement lengths and diffusion coefficients in each of the three axes. We note that additional parameters could be estimated as well, including the peak intensity parameter G and the background rate N bgd . Inserting the motion model into Eq. (15), the Q function under SMC-EM shows that the parameters only appear in the I 2 term. In what follows we describe only the parameters in the x-axis; the remaining two axes follow the same procedure. The M step at the e th iteration is where p(·|·) is the one step transition probability of confined diffusion given in Eq. (1), N is the number of time points in the trajectory, K is the number of simulations used in the BSPS, and x e,k t |N is the value of the k th simulated trajectory at the t th time step. Maximization with respect to the confinement length does not depend on the diffusion parameter and yieldŝ L e+1 It is important to note that the MC samples generated at the e th iteration are always within the interval [︁ −L e /2,L e /2 ]︁ . This implies that (a) the initial guess of the confinement length must be larger than the true value (in absolute value) and that (b) the estimated trajectory is bounded by the estimated confinement length and is guaranteed to hit either the minimum or maximum extent (or both) at least once. This helps to prevent large outliers in estimation.
To estimate the diffusion coefficient, we follow [8] and numerically search for the root of log p(x e,k t+1 |N |x e,k t |N ).
Note that the density p(·) in Eq. (20) depends on both D x and L x . However, the estimate for L x can be calculated from Eq. (19); thusL e+1 x should be used when evaluating Eq. (20). In this work we numerically find the root of this expression using a bisection search.

C. Localization-only schemes
Gaussian fitting (GF) to the DH-PSF As given in Eqs. (2)-(4), the DH-PSF can be approximated by two lobes, each with a Gaussian profile, that rotate about the lateral position of the particle as that particle moves along the optical axis. To estimate the particle location, we fit the image data to the model in Eq. (2) using a nonlinear least squares scheme to determine the centers of the two lobes (x 1,2 , y 1,2 ), as well as the parameters G and σ xy . To initialize the optimizer, we selected the center of the brightest pixel in each lobe. Once the center locations are determined, the midpoint of the line connecting them and the angle of that line with respect to the horizontal axis are calculated. The lateral position of the particle is then taken to be the midpoint and the axial position is inferred from Eq. (4).

Maximum likelihood estimation for sCMOS (MLE sCMOS )
From [14], the probability distribution function for the measured signal h from pixel p with both shot noise and read-out noise can be approximated by whereĨ p,t denotes a Poisson process whose rate is determined by the sum of the background rate and the rate λ p,t (θ xyz,t ) defined by the PSF of the instrument and the position of the particle (denoted here as θ xyz,t ), Γ(·) is the standard Gamma function, N bgd is the background intensity rate, and σ p,t is the standard deviation of the pixel noise defined in Eq. (7) and given by the gain, g p,t and variance Var p,t . MLE sCMOS , the MLE for the particle position θ xyz,t , is then given bŷ θ xyz,t = arg min P sCMOS (h =Ĩ p,t + σ 2 p,t |λ p,t (θ xyz,t ), N bgd , g p,t , Var p,t ) where P denotes the total number of pixels in an image. Using the DH-PSF given in Eq. (2) for the rate λ p,t , and defining h =Ĩ p,t + σ 2 p,t , this estimate takes the form θ xyz,t = arg min θ xyz,t P ∑︂ p=1 [︁ (λ p,t + N bgd + σ 2 p,t ) − h log(λ p,t + N bgd + σ 2 p,t ) + log Γ(h + 1) ]︁ .