ℓ1-regularized maximum likelihood estimation with focused-spot illumination quadruples the diffraction-limited resolution in fluorescence microscopy

Super-resolution fluorescence microscopy has proven to be a useful tool in biological studies. To achieve more than a two-fold resolution improvement over the diffraction limit, existing methods require exploiting the physical properties of the fluorophores. Recently, it has been demonstrated that such an improvement is possible using only a focused illumination spot and numerical post-processing. However, how the achievable resolution is affected by the processing step has not been thoroughly investigated. In this paper, we focus on the processing aspect of this emerging super-resolution microscopy technique. Based on a careful examination of the dominant noise source and of the prior information available in the image, we find that improved accuracy can be expected when a processing scheme matches the dominant noise model in the image and can exploit the prior information in the form of sparsity. Based on simulation results, we identify an improved processing scheme and apply it in a real-world experiment to super-resolve a known calibration sample. We demonstrate an improved resolution of 60 nm, approximately four times beyond the conventional diffraction limit.


ℓ1-regularized maximum likelihood estimation with focused-spot illumination quadruples the diffraction-limited resolution in fluorescence microscopy: supplementary material
Here, we describe implementation details for the processing schemes used in this paper. All numerical experiments and data processing in this paper were performed in MATLAB. To ensure reproducibility of our numerical experiments, we generate the noisy images with MATLAB's random number generator set to a fixed state. To keep the focus of this paper on the numerical problems themselves rather than on the numerical routines used to solve them, we rely on standard solvers such as MATLAB's quadprog and the CVX package. For the more sophisticated numerical problems, we develop custom cost functions, along with their gradients and Hessians, for use with MATLAB's fmincon command. All MATLAB scripts necessary to recreate the results in this paper are available upon request.
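As a brief illustration of this reproducibility setup, the following MATLAB sketch shows how a fixed generator state can be enforced before the noise is drawn; the seed, image size, and noise parameters below are placeholders, not the exact values used in our experiments.

% Placeholder clean image and sCMOS readout-noise level (illustrative only).
I_clean = 100 * ones(32);          % expected photon count per pixel
sigma   = 2;                       % std. dev. of the Gaussian readout noise

% Fix the generator state so the same noisy image is drawn on every run.
rng(0, 'twister');                 % the seed 0 is a placeholder

% Simulated sCMOS image: Poisson shot noise plus Gaussian readout noise.
I_noisy = poissrnd(I_clean) + sigma * randn(size(I_clean));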

IMPLEMENTATION OF THE PROCESSING SCHEMES USED IN SIMULATION

A. NNLS and WNNLS
For NNLS, we solve the following numerical optimization problem:

$$x_{\mathrm{NNLS}} = \underset{x \geq 0}{\mathrm{argmin}} \; \| H_{\mathrm{PSF}}\, x - I_{\mathrm{noisy}} \|_2^2 \quad \text{(S1)}$$

Definitions for the terms in Eq. (S1) can be found in the main text. To solve Eq. (S1), we use MATLAB's quadratic programming solver quadprog with the lower bound for x set to zero. We use the default settings for the other quadprog parameters. To verify convergence of the quadprog solver under the default settings, we repeated the simulation with different parameters, such as a much stricter stopping criterion, and observed no significant difference from the results obtained with the default settings.

For WNNLS, we modify NNLS such that each pixel is weighted by its variance. The modified optimization problem is

$$x_{\mathrm{WNNLS}} = \underset{x \geq 0}{\mathrm{argmin}} \; \| W ( H_{\mathrm{PSF}}\, x - I_{\mathrm{noisy}} ) \|_2^2 \quad \text{(S2)}$$

where W is a diagonal matrix that contains the weights for each pixel. The matrix W is zero everywhere except on the diagonal, and the elements on its diagonal, $W_{\mathrm{diag}}$, are calculated by

$$W_{\mathrm{diag},i} = \frac{1}{\sqrt{I_{\mathrm{noisy},i} + \sigma^2}} \quad \text{(S3)}$$

where σ² is the variance of the Gaussian readout noise of the simulated sCMOS camera. To solve Eq. (S2), we apply the weights to H_PSF and I_noisy beforehand and use quadprog with the same settings as for NNLS to solve the modified numerical problem.
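For concreteness, the following is a minimal sketch of the two quadprog calls; quadprog minimizes (1/2)x'Qx + f'x, so Eq. (S1) maps to Q = H'H and f = -H'I. The placeholder values below merely make the script self-contained and are not those used in our simulations.

n = 16;                                     % reconstruction grid size (illustrative)
H_PSF   = rand(n);                          % placeholder system matrix
I_noisy = rand(n, 1);                       % placeholder vectorized image
sigma   = 2;                                % readout-noise std. dev.

% NNLS, Eq. (S1): nonnegativity enforced through the lower bound.
x_nnls = quadprog(H_PSF' * H_PSF, -H_PSF' * I_noisy, ...
                  [], [], [], [], zeros(n, 1), []);

% WNNLS, Eqs. (S2)-(S3): pre-weight each row of H_PSF and each pixel of
% I_noisy by the inverse square root of its estimated variance.
w  = 1 ./ sqrt(max(I_noisy, 0) + sigma^2);
Hw = w .* H_PSF;                            % implicit expansion scales each row
Iw = w .* I_noisy;
x_wnnls = quadprog(Hw' * Hw, -Hw' * Iw, ...
                   [], [], [], [], zeros(n, 1), []);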

B. r-NNLS and r-WNNLS
For r-NNLS and r-WNNLS, the formulation of the numerical problems is altered slightly by adding a sparsity-inducing ℓ1 term. The modified numerical problems are:

$$x_{\mathrm{r\text{-}NNLS}} = \underset{x \geq 0}{\mathrm{argmin}} \; \| H_{\mathrm{PSF}}\, x - I_{\mathrm{noisy}} \|_2^2 + \lambda \| x \|_1 \quad \text{(S4)}$$

$$x_{\mathrm{r\text{-}WNNLS}} = \underset{x \geq 0}{\mathrm{argmin}} \; \| W ( H_{\mathrm{PSF}}\, x - I_{\mathrm{noisy}} ) \|_2^2 + \lambda \| x \|_1 \quad \text{(S5)}$$

We utilize the CVX package [1] to solve Eq. (S4) and Eq. (S5). We execute these two processing schemes with cvx_precision set to best and cvx_solver set to SeDuMi. Selection of the regularization parameter λ is covered in Section 2 of this supplementary material.
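A minimal CVX sketch of Eq. (S4) follows; Eq. (S5) is identical with the pre-weighted H_PSF and I_noisy substituted. H_PSF, I_noisy, n, and lambda are assumed to be in the workspace.

cvx_begin quiet
    cvx_precision best
    cvx_solver sedumi
    variable x(n) nonnegative
    % r-NNLS, Eq. (S4): least-squares fidelity plus the l1 penalty.
    minimize( sum_square(H_PSF * x - I_noisy) + lambda * norm(x, 1) )
cvx_end
x_rnnls = x;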

C. Poisson-MLE and VST
For Poisson-MLE and VST, we solve the following two numerical problems:

$$x_{\mathrm{Poisson\text{-}MLE}} = \underset{x \geq 0}{\mathrm{argmin}} \; \sum_{i=1}^{n} \left[ (H_{\mathrm{PSF}}\, x)_i - I_{\mathrm{noisy},i} \ln (H_{\mathrm{PSF}}\, x)_i \right] \quad \text{(S6)}$$

and

$$x_{\mathrm{VST}} = \underset{x \geq 0}{\mathrm{argmin}} \; \sum_{i=1}^{n} \left( 2\sqrt{(H_{\mathrm{PSF}}\, x)_i + \tfrac{3}{8}} - 2\sqrt{I_{\mathrm{noisy},i} + \tfrac{3}{8}} \right)^2 \quad \text{(S7)}$$

In Eq. (S6) and Eq. (S7), i is the index referring to the i-th element of a vector, and n is the total number of pixels in the acquired or reconstructed image. Using the CVX package for Eq. (S6) and Eq. (S7) (and, subsequently, their regularized counterparts) is prohibitively slow. To overcome this problem, we derive expressions for customized cost functions (and their respective gradients and Hessians) to create a more efficient matrix-operation-based implementation. These expressions are then used by MATLAB's constrained optimizer fmincon to find a solution. We include the scripts for calculating the cost function value, gradient, and Hessian for Poisson-MLE and VST in a GitHub repository [2].
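The scripts in [2] are authoritative; as a hedged sketch of the general pattern, the Poisson-MLE objective of Eq. (S6) can be written as a single function returning the cost, gradient, and Hessian, and passed to fmincon as follows. The small offset guarding the logarithm is an implementation detail, not part of Eq. (S6); x0, H_PSF, I_noisy, and n are assumed to be in the workspace.

% Solve Eq. (S6) with a gradient- and Hessian-aware algorithm.
opts = optimoptions('fmincon', 'Algorithm', 'trust-region-reflective', ...
    'SpecifyObjectiveGradient', true, 'HessianFcn', 'objective');
x_mle = fmincon(@(x) poissonNLL(x, H_PSF, I_noisy), x0, ...
    [], [], [], [], zeros(n, 1), [], [], opts);

function [f, g, H] = poissonNLL(x, H_PSF, I_noisy)
    mu = H_PSF * x + eps;                        % predicted mean counts
    f  = sum(mu - I_noisy .* log(mu));           % negative Poisson log-likelihood
    g  = H_PSF' * (1 - I_noisy ./ mu);           % analytic gradient
    H  = H_PSF' * ((I_noisy ./ mu.^2) .* H_PSF); % analytic Hessian
end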

D. r-Poisson-MLE and r-VST
For r-Poisson-MLE and r-VST, we solve the following modified numerical problems:

$$x_{\mathrm{r\text{-}Poisson\text{-}MLE}} = \underset{x \geq 0}{\mathrm{argmin}} \; \sum_{i=1}^{n} \left[ (H_{\mathrm{PSF}}\, x)_i - I_{\mathrm{noisy},i} \ln (H_{\mathrm{PSF}}\, x)_i \right] + \lambda \| x \|_1 \quad \text{(S8)}$$

and

$$x_{\mathrm{r\text{-}VST}} = \underset{x \geq 0}{\mathrm{argmin}} \; \sum_{i=1}^{n} \left( 2\sqrt{(H_{\mathrm{PSF}}\, x)_i + \tfrac{3}{8}} - 2\sqrt{I_{\mathrm{noisy},i} + \tfrac{3}{8}} \right)^2 + \lambda \| x \|_1 \quad \text{(S9)}$$

As with Eq. (S6) and Eq. (S7), we derive customized cost functions and their respective gradients and Hessians to be solved using fmincon. Selection of the regularization parameter λ for r-Poisson-MLE and r-VST is covered in Section 2.
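One detail worth noting: under the nonnegativity constraint, ‖x‖₁ equals the sum of the entries of x, so the added ℓ1 term remains smooth and only shifts the gradient by a constant while leaving the Hessian unchanged. A hedged sketch, reusing the poissonNLL function from the previous subsection:

function [f, g, H] = rPoissonNLL(x, H_PSF, I_noisy, lambda)
    [f, g, H] = poissonNLL(x, H_PSF, I_noisy);
    f = f + lambda * sum(x);  % l1 penalty: x >= 0 implies norm(x,1) = sum(x)
    g = g + lambda;           % constant shift of the gradient
                              % Hessian H is unchanged
end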

SELECTION OF REGULARIZATION PARAMETER
The regularized processing schemes discussed in this paper require a regularization parameter, λ, to be selected. An appropriate value of λ is crucial for good performance, as we demonstrate in Fig. S1. If λ is too small, the regularized processing scheme shows no significant improvement over its non-regularized counterpart. If λ is too large, the processing scheme is biased towards an all-zero solution, regardless of the acquired image. Note that the processing scheme is still behaving correctly: it is still minimizing the correct cost function. However, if λ is too large, the optimization problem effectively becomes

$$x = \underset{x \geq 0}{\mathrm{argmin}} \; \lambda \| x \|_1 \quad \text{(S10)}$$

which has the trivial solution of x being a zero vector. This can be seen in Fig. S1: the processed images for the over-regularized case (bottom row) show pixels with extremely low intensities.

A straightforward solution is to manually tune the regularization parameter until the processed image is "visually pleasing." However, in our technique, where tens of thousands of small intermediate images are collected, manually tuning each acquired intermediate image is prohibitively slow (we provide an example in the next section). Additionally, because only a very small region of the sample is illuminated for each raw image, it is difficult to judge whether a processed image is "visually pleasing" when it consists of only a small number of bright pixels. A single universal regularization parameter could be selected and applied to all acquired images, but it is unlikely to produce optimal results: as we state in the main text, the recorded images show varying levels of sparsity as the illumination spot is scanned across different parts of the sample. Therefore, an automatic selection scheme that uses only the acquired data is needed to select an optimal λ for each acquired image.

While this is not a trivial problem, techniques such as the L-Curve [3,4] and generalized cross-validation (GCV) [5] have been shown to be viable options. These methods use a goodness-of-fit measure and a degree-of-freedom measure to find a good trade-off between over- and under-regularization. In the simulation results in this paper, we adopt both methods to select λ automatically for the different processing schemes. For the goodness-of-fit term, we use the value of the unregularized term of the regularized cost function, sometimes referred to as the "fidelity term" in existing works. For the degree-of-freedom measure, we use the ℓ1-norm of the solution, following an existing example of adapting the L-Curve to ℓ1 regularization [6]. For Eq. (S4), for example, the goodness-of-fit term f₁ would be

$$f_1(\lambda) = \| H_{\mathrm{PSF}}\, x_\lambda - I_{\mathrm{noisy}} \|_2^2 \quad \text{(S11)}$$

and the degree-of-freedom measure f₂ would be

$$f_2(\lambda) = \| x_\lambda \|_1 \quad \text{(S12)}$$

In Eq. (S11) and Eq. (S12), x_λ is the solution to the numerical problem when the regularization parameter is set to λ.
To identify an optimal λ, we first determine a range of λ values over which the search is conducted. Next, we generate a series of sample points within that range and calculate the goodness-of-fit and degree-of-freedom measures at each point. We then interpolate these values to cover the entire search range. For GCV, we compute the GCV curve and use the λ value that gives the lowest GCV value. For the L-Curve, we adopt the expression for the analytical curvature from [7] and use the λ value that produces the greatest curvature.
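The following sketch illustrates this search for the L-Curve case. Here solveRegularized is a hypothetical placeholder for any of the regularized solvers above, the search range is illustrative, the interpolation step is omitted for brevity, and a simple numerical curvature is substituted for the analytical expression from [7].

lambdas = logspace(-4, 2, 10);               % illustrative search range, 10 samples
f1 = zeros(size(lambdas)); f2 = zeros(size(lambdas));
for k = 1:numel(lambdas)
    x_k   = solveRegularized(H_PSF, I_noisy, lambdas(k));  % placeholder solver
    f1(k) = norm(H_PSF * x_k - I_noisy)^2;   % goodness of fit, Eq. (S11)
    f2(k) = norm(x_k, 1);                    % degree of freedom, Eq. (S12)
end
% L-Curve corner in log-log coordinates: maximize the curvature of (log f1, log f2).
u  = log(f1);       v  = log(f2);
du = gradient(u);   dv = gradient(v);
kappa = (du .* gradient(dv) - dv .* gradient(du)) ./ (du.^2 + dv.^2).^(3/2);
[~, kBest]  = max(abs(kappa));
lambda_best = lambdas(kBest);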
In our simulations, based on observed performance, we use L-Curve for r-NNLS, r-WNNLS, and r-Poisson-MLE, and GCV for r-VST.

IMPLEMENTATION OF R-VST FOR PROCESSING EXPERIMENTAL DATA
While the L-Curve and GCV provided good estimates of λ in simulation, we encountered practical challenges when applying r-VST, with a λ value selected by GCV, to real-world experimental data. First, the processing time was prohibitively long. For the Argolight-sample experiment (Fig. 6 in the main text), 9 × 10⁴ small intermediate images were collected. In our simulations, we sampled the range of λ with 10 sample points, and an optimization problem must be solved at each point, meaning that 9 × 10⁵ optimization problems would need to be solved in total. For our matrix-based implementation of the cost function, a typical run time for solving one image at one λ sample point was about 1 second. This corresponds to a total run time of 9 × 10⁵ seconds, or roughly 10.4 days, which is impractically long for general biological imaging purposes. In addition, we discovered that GCV does not produce an effective λ value, but instead converges to one of the two ends of the search range, resulting in either an under-regularized or an over-regularized image. We observed a similar phenomenon when using the L-Curve to select λ for r-Poisson-MLE. We believe this could be caused by the background fluorescence present in a real-world experiment: in the two-dot test-object simulation (data not shown), when we added a uniform background with a value of 10% of the photons emitted by a single dot, both the L-Curve and GCV produced regularization parameters that severely under-regularized the processed images.
Because of these complications, we implemented the following alternative optimization problem to process all of the experimental data:

$$x_{\mathrm{r\text{-}VST,alt}} = \underset{x \geq 0,\; \| x \|_1 \leq \beta}{\mathrm{argmin}} \; \sum_{i=1}^{n} \left( 2\sqrt{(H_{\mathrm{PSF}}\, x)_i + \tfrac{3}{8}} - 2\sqrt{I_{\mathrm{noisy},i} + \tfrac{3}{8}} \right)^2 \quad \text{(S13)}$$

In Eq. (S13), instead of λ, the processing scheme requires a different input, β, to specify the strength of the applied regularization. It has been shown that formulations like Eq. (S13) can be equivalent to Eq. (S9) for an appropriate pairing of λ and β [8]. Using this alternative formulation, we found that instead of searching for an appropriate regularization parameter for each raw image, satisfactory results could be obtained by specifying a universal β. For the r-VST result shown in Fig. 6 of the main text (we did not develop an alternative formulation for r-Poisson-MLE, owing to its performance being similar to that of r-VST in simulation), we manually tuned over a range of β values until the maximum resolution was achieved and verified it by plotting a diagonal profile. In practice, a human user will ultimately use these super-resolved images to solve scientific problems. In such cases, the processing scheme can output several candidate images, each processed with a different regularization strength (i.e., a different value of β), and the user selects the most appropriate one for their application.
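Because x ≥ 0 in Eq. (S13), the constraint ‖x‖₁ ≤ β reduces to the single linear inequality Σᵢ xᵢ ≤ β, which fmincon handles directly. A hedged sketch follows, where vstCost is a hypothetical placeholder for our VST objective with gradient, analogous to poissonNLL above; the value of β is illustrative, and x0, H_PSF, I_noisy, and n are assumed to be in the workspace.

beta = 500;                                  % illustrative universal value
A = ones(1, n);  b = beta;                   % encodes sum(x) <= beta
opts = optimoptions('fmincon', 'Algorithm', 'interior-point', ...
    'SpecifyObjectiveGradient', true);
x_alt = fmincon(@(x) vstCost(x, H_PSF, I_noisy), x0, ...  % vstCost is a placeholder
    A, b, [], [], zeros(n, 1), [], [], opts);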