Computationally-efficient spatiotemporal correlation analysis super-resolves anomalous diffusion

Anomalous diffusion dynamics in confined nanoenvironments govern the macroscale properties and interactions of many biophysical and material systems. Currently, it is difficult to quantitatively link the nanoscale structure of porous media to anomalous diffusion within them. Fluorescence correlation spectroscopy super-resolution optical fluctuation imaging (fcsSOFI) has been shown to extract nanoscale structure and Brownian diffusion dynamics within gels, liquid crystals, and polymers, but has limitations which hinder its wider application to more diverse, biophysically-relevant datasets. Here, we parallelize the least-squares curve fitting step on a GPU improving computation times by up to a factor of 40, implement anomalous diffusion and two-component Brownian diffusion models, and make fcsSOFI more accessible by packaging it in a user-friendly GUI. We apply fcsSOFI to simulations of the protein fibrinogen diffusing in polyacrylamide of varying matrix densities and super-resolve locations where slower, anomalous diffusion occurs within smaller, confined pores. The improvements to fcsSOFI in speed, scope, and usability will allow for the wider adoption of super-resolution correlation analysis to diverse research topics.


Introduction
Anomalous diffusion in confined and crowded environments plays a role in the function of material and biophysical systems. In smart hydrogels, the anomalous diffusion of analytes within stimuli-responsive polymers are used for drug delivery and separation applications [1][2][3][4]. The heterogeneous environment of the extracellular matrix causes anomalous diffusion of signaling proteins, nucleic acids, or small molecule therapeutics that can affect the time scale of cellular changes [5]. Similarly, the topologic complexity of the intracellular space with cytoskeletal filaments and macromolecular crowding leads to anomalous diffusive behavior as biomolecules within the cell [6][7][8]. The effect of the heterogeneous nanostructures found in soft materials on the anomalous diffusion of molecules has important consequences in the use of biological, industrial, and medical materials.
Characterizing the interplay between the nanoscale structure of a material and anomalous diffusion is difficult with existing techniques. Scanning and transmission electron microscopy (SEM/TEM) require the sample to be prepared with flash-freezing or fixation that can warp the nanostructure of the materials, and cannot be used to quantify diffusion dynamics [9]. Atomic force microscopy (AFM) can also not quantify diffusion within materials and is limited to surface imaging only [10]. Optical techniques such as fluorescence correlation spectroscopy (FCS) and fluorescence recovery after photobleaching (FRAP) can accurately measure diffusion [11][12][13][14][15][16][17], but the diffraction-limited nature of light prevents the techniques from examining the heterogeneity of porous structures at nanoscales [18]. Super-resolution optical fluctuation imaging (SOFI) can super-resolve structure alone by analyzing wide field image stacks correlation of material's stru Correlatio optical micro information. I two correlati resolved spati combines the molecules (Fi (Fig. 1c) (Fig. 1b) and c the resulting co (Fig. 1d). FC (Fig. 1e)  to extract diffusion coefficients more accurately and increase the spatial resolution of mapping pores more effectively at lower signal-to-background ratios than previous superresolution imaging and single-particle tracking techniques. But the use of fcsSOFI analysis has been limited to these select applications by the developers Kisley and Landes, et al. fcsSOFI has several limitations which hinder its use to more diverse datasets and by other research groups. fcsSOFI is most significantly limited by its computational expense. The least-squares curve fitting step requires overnight analysis times for typical microscopy images on a desktop computer. Ideally, fcsSOFI would be accessible without the need for high performance computing. Analysis of diffusion modes beyond simple Brownian diffusion, such as multi-component or anomalous diffusion, requires the explicit reprogramming of the FCS step: the fit model function and its parameters must be redefined, and the pixel-by-pixel least-squares curve fitting loop must be edited appropriately. In addition, the MATLAB analysis code is scattered in multiple scripts and functions and lacks user-friendliness.
Here, we have built and distributed an updated fcsSOFI software package which significantly improves upon the speed, scope, and accessibility of the original, while providing additional, biophysically-relevant diffusion models. First, following a widespread trend in computational and data science [28], we accelerate fcsSOFI by parallelizing the leastsquares curve fitting step on a GPU. We then equip fcsSOFI with an expanded list of fitting models which allow for the analysis of experimentally-relevant modes of diffusion. We then apply fcsSOFI to simulated datasets of diffusing proteins in polyacrylamide images and demonstrate fcsSOFI's ability to accurately characterize pore size and reveal non-Brownian diffusion dynamics due to confinement. Lastly, we package our updated software in a flexible, modular, and user-friendly graphical user interface (GUI), and distribute the accelerated code and GUI as an open-source tool via GitHub. With our updated fcsSOFI, we maintain previous levels of accuracy and performance while greatly decreasing runtimes, enabling the analysis of multiple datasets on standard computer hardware. We hope the accessible GUI enables a more diverse group of researchers to use fcsSOFI in their own work exploring diffusion within diverse samples of interest.

Computational equipment
MATLAB, the language the original version of fcsSOFI was developed in by Kisley, et al. in 2015 [21], was used to implement the updates to the fcsSOFI software. MATLAB versions 2018a and 2020a were used to perform the updates. The open-source software build manager CMake (version 3.15.0) and Microsoft Visual Studio 2019 were used to incorporate GPUparallelized curve fitting into fcsSOFI by rebuilding the MATLAB binding of the opensource curve fitting library Gpufit [26]. Execution speeds were benchmarked on two sets of computer hardware: a 2019 Dell Precision 3630 desktop equipped with a 3.7 GHz Intel Core i7 8th Gen CPU, an 8 GB Nvidia Quadro P4000 GPU, and 32 GB of RAM; and a 2017 Dell XPS 15 laptop equipped with a 2.8 GHz Intel Core i7 CPU, a 4 GB NVIDIA GeForce GTX 1050 GPU, and 16 GB of RAM.

GPU-parallelized, least squares curve fitting accelerates fcsSOFI computational time
Serial fitting is the most time-intensive step in the original fcsSOFI analysis. The iterative least-squares curve fitting in the FCS step is performed sequentially for each single pixel autocorrelation curve, where one set of parameter choices is evaluated before testing another set of parameters. Each single pixel autocorrelation curve can require many tens or hundreds of iterations to fit (an average of 86 fit iterations per pixel and maximum of 401 iterations were observed for a 30×30×5000 image stack). As a result, the FCS step can be a computationally demanding process, especially with experimentally-sized images, commonly 512 x 512 = 262,144 pixels or larger, with newer cameras containing upwards of 107 pixels [29].
Parallelization with GPU technologies can improve computation times in fcsSOFI. While modern CPUs are designed with compute cores that are, on their own, significantly faster than individual GPU cores, it is uncommon for a CPU to have more than eight cores (our laptop and desktop CPUs contain four and six cores, respectively), whereas commercial GPUs can consist of well over one thousand cores (the Nvidia Quadro P4000 contains 1792 cores). GPU-parallelization is thus most appropriate for tasks that require many small, but mutually independent computations, like iterative fitting. GPU computing has seen wider adoption in computational microscopy as parallelization can accelerate many of the demanding image processing tasks necessary to resolve large amounts of multidimensional microscope data [30]. GPU computing has previously been used to accelerate maximum likelihood estimation and localization-based super-resolution microscopy [31] as well as medicallyrelevant scanning photoacoustic microscopy [32].
We parallelize the pixel-by-pixel least-squares curve fitting step that extracts diffusion information in fcsSOFI with an open-source GPU-based curve fitting method. Gpufit is an open-source, CUDA-based GPU curve fitting package developed by Przybylski et al. in 2017 [26]. The developers of Gpufit demonstrated the efficacy of their GPU curve fitting package with an application to a stochastic optical reconstruction microscopy (STORM) algorithm [26]. Here, Gpufit mediates the computational expense of the original FCS implementation by parallelizing the iterative fitting on a GPU. Since Gpufit was developed for fitting and localizing single-molecule diffraction-limited patterns, the original binding does not include the fit model functions for the various modes of nanoscale diffusion we study. We used the functionality of Gpufit for user-defined custom models in the form of CUDA header files and corresponding model IDs via C programming [26]. Implementing Gpufit maintains the same accuracy and precision as the original MATLAB 'fit' function in the original CPU-based fcsSOFI. In the case of fcsSOFI, instead of performing each fitting iteration sequentially on a CPU, the now parallelized FCS step tests many parameter combinations simultaneously by placing each independent fit iteration on one of the many hundreds of compute cores on a GPU (Fig. 1e).
fcsSOFI using GPU-parallelized fitting can be upwards of 40 times faster than CPU fitting depending on image size. The execution time of the updated, parallelized and original, serial fcsSOFI codes were studied as data size was increased (Fig. 2). An experimental set of 100 nm fluorescent beads-diffusing in agarose hydrogel data used in the original study on fcsSOFI [21] was segmented into five region-of-interest (ROI) sizes ranging from 32×32 pixels to the original size of 512×512 pixels. All 1000 image frames from the original data set were used for each ROI size, and all five constructed data sets were fed into both GPU-and CPU-based fitting versions of fcsSOFI. Execution times were recorded using the 'tic' and 'toc' functions in MATLAB. For the GPU-parallelized code, three separate trials were taken at each ROI size and averaged. The Dell Precision desktop computer was used for every trial. The updated code significantly outperformed the original code at every image size due to GPU-parallelized fitting: approximately 40 times faster for the 32×32-pixel image and still 10 times faster for the Fig. 2. Parallelized, GPU fitting (blue) speeds fcsSOFI up to 40-times faster than sequential, CPU fitting (red). Total fcsSOFI execution times were measured at five experimental datasets ranging from 32 × 32 pixels to 512 ×512 pixels, 1000 frames long. Error bars (standard deviation) were between 0.05 and 2.5% of the average computation times and are too small to be viewed on the plot. The updated execution times are compared to the execution times of the parts of the code parallelized with Gpufit (green) and are magnified in the inset. Whereas the portions of code which are GPU-specific show a linear increase in computation time for larger image sizes, the entire updated code shows a disproportionate increase.
experimentally sized 512×512-pixel image (Fig. 2). In particular, the 32×32-pixel image required 315.8 s to analyze with the original CPU-based fcsSOFI, an execution time which GPU-parallelized fitting improved to 7.8 ± 0.2 s, on average. Despite the speed increases brought by the GPU-parallelization, the execution time improvement for larger images is limited by the proportion of the code that is not able to be parallelized and must be executed serially by the CPU. Whereas the execution times for the CPU-based code increase linearly as image size increases, which is expected for a proportional increase in computations for larger image sizes, Fig. 2 reveals a nonlinear increase in execution time for the GPU-based code. While this nonlinear trend initially looks unusual, we can look to Amdahl's law for an explanation, which states that the maximum speed improvement resulting from the GPU-parallelization of a given program is constrained by where is the fraction of code that can be parallelized and is the number of GPU cores [33]. Only tasks which consist of many small, independent computations are able to be executed in parallel, which is not the case for the many steps in the fcsSOFI algorithm, such as loading data into the MATLAB workspace, the SOFI analysis, creating figures, and saving data. Thus, the speed increase of the updated fcsSOFI software is limited by its large portion of code that is serially computed on the CPU (Fig 2., inset). Nevertheless, the GPUparallelized code is still significantly faster than the original code at every image size tested.

Additional curve fitting models expand the relevance of fcsSOFI to other diffusion types
An expanded library of curve fitting models in fcsSOFI accounts for different types of diffusion. FCS is performed pixel-by-pixel by treating each pixel as an individual focal volume. The autocorrelation curve of each single pixel intensity trajectory is fit to a known model of diffusion [21]. In the original fcsSOFI code, autocorrelation curves were fit exclusively to a Brownian diffusion model [34], given by Here ( , ), the autocorrelation for a position (pixel) and time lag , is expressed in terms of an amplitude ( ), an offset , and a characteristic diffusion time . In our updated fcsSOFI software, the Brownian model given by Eq. 2 can be swapped with additional diffusion models without explicit reprogramming. The anomalous diffusion model is of particular relevance to diffusion in complex, crowded environments [35,36], and is given by where is the anomalous power where the amplitude and characteristic diffusion time for each component is labeled with a 1 or 2.
Maps of the diffusion coefficient and anomalous power are produced with the curve fits at each pixel. The diffusion coefficient is extracted for each fit from our two-dimensional microscopy data with where is the size of the detection region and is the best fit estimate of the characteristic diffusion time [37]. For fcsSOFI, can be taken as the physical width of each pixel, usually on the order of tens of nanometers. The values of for each pixel are then compiled into a single map where color scaling indicates the value of across the image (Fig. 1i). For the exception of the expanded list of diffusion models used to extract an estimate for , this step is largely unaltered from the original algorithm. However, with the addition of the anomalous diffusion model to updated fcsSOFI pipeline (Eq. 3), similar color maps can be made for the anomalous power (Fig. 1g). The updated fcsSOFI software automatically produces the map along with the FCS map when the anomalous model is selected. If the two-component Brownian diffusion model is selected instead, a second map is produced for the second diffusion constant.

Simulation of biophysical protein diffusion under confined conditions
We produce and analyze simulated data of protein diffusion within hydrogel environments to demonstrate the applicability of the new functionalities in fcsSOFI to biophysically-relevant diffusion in nanostructure materials. Hydrogels are used to mimic the extracellular matrix as supports in tissue engineering and are used as drug delivery vectors [38,39]. From the perspective of proteins, the hydrogel presents local, nanoscale, physicochemical variations in steric confinement and chemistry that can affect the adsorption, diffusion, conformation, folding, and aggregation of the protein [40,41]. The in situ, super-resolution, quantitative capabilities fcsSOFI make it uniquely suited to produce images at the length-scales of hydrogels and the extracellular matrix and the time-scales of protein dynamics.
Environments with clearly segregated void space and gel barriers to simulate diffusion within were created by binarizing and morphologically altering existing SEM images of polyacrylamide from Lira et al. [42]. To account for uneven illumination in the SEM image, Otsu's method, an adaptive algorithm, was used to binarize the image. Combinations of the built-in MATLAB morphological functions, 'imclose,' 'imopen,' 'imdilate,' and 'imerode,' were applied to the binarized image to mimic denser and more open polyacrylamide gels that emulate extracellular matrix from different areas of the body [43]. These morphologically altered binarizations (dense and open matrices), along with an original, unaltered binarization (medium matrix), were used as the environments for diffusion (Fig. 3a-c).
To simulate biophysically-relevant diffusion, 2D random walks were simulated within the nanocavities of the pore maps using step size distributions based on experimentallyobtained parameters (Appendix B). Fluorescently-labeled fibrinogen immobilized on glass was measured experimentally on a wide-field total internal reflection fluorescence (TIRF) microscope (Appendix C). The experimental values of the protein intensity and background were used in the simulation. Emitters were represented by 2D Gaussian point spread functions with a full width half max of 302 nm, and were simulated diffusing at = 1 × 10 [44]. Each emitter's intensity was drawn from a Poisson distribution to simulate shot noise, while simulated readout noise was added using a random normal distribution. An experimentally-determined signal to background ratio of 1.07 was used. Simulations were performed with a frame rate of 100 Hz and a pixel size of 47.6 nm [42]. Each step of the random walk was calculated based on the diffusion coefficient, time step, pixel size, and distribution width. It was then segmented into 1000 smaller substeps, and added onto the previous position substep by substep until the particle collided with a barrier or completed the full step. Particles colliding with barriers had their step sizes truncated at the collision point. Simulations of the Brownian diffusion of 300 emitters were recorded for 5000 frames and analyzed with fcsSOFI.
Super-resolved diffusion maps extracted by fcsSOFI show the local diffusion coefficients of the simulated emitters at the nanoscale (Fig. 3d-f). The ensemble averaged diffusion coefficient in the medium matrix is = (5 ± 1) × 10 compared to the simulated = 1 × 10 . Examining local diffusion behavior below the diffraction limit reveals that while diffusive behavior in the centers of larger pores is around = 1 × 10 (Fig. 3e, green arrow), diffusion coefficients near the edge of the pores, where emitters have a chance to move in the direction of the wall and immediately collide with the barrier, effectively lower the diffusion coefficient to around = 5 × 10 (Fig. 3e, blue arrow). Furthermore, in the smaller pores and other more confined regions, emitters become trapped, with diffusion coefficients around = 1 × 10 (Fig. 3e, red arrow). Trapped particles with lower diffusion coefficients were observed more frequently in denser matrices. The denser the matrix, the more pores there are with sizes comparable to the diameter of emitters. This results in some particles being trapped in pores that heavily restrict their step sizes, hence very low diffusion coefficients.
Anomalous factor maps (α-maps) extracted by fcsSOFI show the sub-diffraction limit effects of confinement on the local anomaleity of the simulated emitters. The ensemble averaged anomaleity, α, in the unaltered binarized SEM image was = 0.87 ± 0.06 compared to the = 1 expected of Brownian diffusion. Near the edges of larger pores, anomaleities around = 1 (Fig. 3f, blue arrow) are observed as once a particle moves in a direction away from a barrier, they are likely to be able to take a full step. In the centers of pores, anomaleities < 1 (Fig. 3f, green arrow) are observed as step lengths are more likely to be cut short due to barriers being nearer to the emitter in every direction. The heterogeneities in both the diffusion coefficients and anomaleities appear in features below the diffraction limit, emphasizing the importance of the super-resolution capabilities of fcsSOFI. Fig. 3. fcsSOFI extracts super-resolved maps of the diffusion dynamics of simulated fibrinogen diffusing in three densities of polyacrylamide gels. Ground truth images (a-c) were generated by coloring the binarized image used in the simulations with the color corresponding to the simulated diffusion coefficient, = 1 × 10 . Superresolved maps of diffusion coefficient (d-f) and anomaleity (g-i) were extracted by fcsSOFI. We observe the effects of confinement (e, h: green and blue arrows) and trapped diffusion (e: red arrow). Cumulative distribution of pore sizes for each matrix density verify fcsSOFI's ability to resolve individual nanopores and produce accurate pore size distributions. Scale bars are 2 μm.
Accurate pore size distributions were extracted by fcsSOFI. The super-resolution images ( Fig. 3e-f) were compared to the ground-truth binarized and morphologically-altered SEM images used as the diffusion simulation environments (Fig. 3a-c). Pore sizes were extracted from both sets of images using Delauney triangulation [45] and cumulative distribution functions (CDFs) were generated to compare the pore size distributions (Fig. 3j-l). The overlap in the simulated and ground-truth CDFs is high, demonstrating the accuracy of fcsSOFI for spatial analysis. Notably, some isolated pores are missing from the analyzed image in cases where a pore was not sampled by a fibronogen molecule over the course of the simulation. This is likely to occur experimentally, not just in simulation, as fluorophores introduced to a porous medium have little chance of entering completely isolated pores.

Development of open-source fcsSOFI GUI
With the goal of making fcsSOFI accessible to wide array of researchers, we present our updated fcsSOFI code both as an easy-to-use GUI and modular MATLAB script. Accessible via MATLAB or the MATLAB compiler, the GUI enables fellow researchers to access the full functionality of fcsSOFI without having to interface with the source code or conduct any independent programming. The GUI contains clearly labeled text inputs, switches, and buttons for data input, diffusion model selection, data inspection, post-processing, figure modification, and exporting data. The GPU-parallelized fcsSOFI software is also available as a standalone MATLAB script for experienced programmers to have finer control on the analysis pipeline. Both the script and GUI automatically pair with the same custom Gpufit library containing the additional diffusion models, and are written to have as similar functionality as possible to accommodate varying preference and experience.
The fcsSOFI GUI, script, and customized Gpufit binding are openly distributed via GitHub [46]. Included in the GitHub distribution is a PDF user-guide for the GUI containing thorough user instructions and troubleshooting tips, as well as many orienting screenshots and diagrams. Researchers with interests in nanoscale diffusion and structure, super-resolution microscopy, and single particle tracking are able to freely download and use fcsSOFI on their own projects. While we anticipate our GUI to make fcsSOFI more accessible to researchers with limited coding skills, researchers who program frequently will be able to provide valuable feedback, feedback which can inform future updates to fcsSOFI. In addition, the fcsSOFI GUI and script will continue to be maintained by our lab with Git version control and updates will be pushed the public GitHub repository as the updated fcsSOFI software is further refined.

Conclusions
The nanoscale environments of porous, biophysical and material systems have macroscale implications on diffusion dynamics within them. We introduced an updated version of fcsSOFI, a correlation-based super-resolution technique that has previously been shown to be able to extract nanoscale structure and Brownian diffusion dynamics with improved speed, scope, and usability. The speed was improved by parallelizing the least-squares curve fitting step on a GPU, and achieve improved computation times by up to a factor of 40. Anomalous diffusion and two-component Brownian diffusion models were added to broaden the scope of biophysically-relevant datasets that can be analyzed using fcsSOFI. fcsSOFI was also packaged into a user-friendly GUI which can easily be outfitted with new diffusion models by the user. The application of fcsSOFI to simulations of fibrinogen diffusing in various matrix densities of polyacrylamide revealed more densely packed pores increased subdiffusive effects, while demonstrating the expanded biophysical scope of fcsSOFI. We hope that the updates in speed, scope, and usability allow a more diverse group of researchers to incorporate fcsSOFI into their research in topics like the extracellular matrix [5, 47,48], chromatographic media [4,49], biofilms [50], cell membranes and cytosol [51,52], catalysis [53,54], and polymers [55].

Appendix A fcsSOFI accurately characterizes anomalous diffusion dynamics
Datasets of diffusion at = 1 × 10 (slower diffusion and simulation code modifications made to minimize confinement effects) with variable anomalous factors were generated to demonstrate the ability of fcsSOFI to accurately extract α. All averaged D and α values were comfortably within error of the values used to simulate the datasets.

Appendix B Limited simulation complexity
Simulated datasets of fibrinogen diffusion were created as a simple case to demonstrate the efficacy of fcsSOFI as a super-resolution tool. Thus, it does not account for protein structure (each fibrinogen molecule is treated as a sphere), fibrinogen-fibrinogen and fibrinogen-matrix interactions, or the potential dynamic nature of the matrix. Experimental data of immobilized fibrinogen molecules was used to select experimentally plausible values for signal-to-noise ratio (SNR).