Verifying molecular clusters by 2-color localization microscopy and significance testing

Arnold, Andreas M.; Schneider, Magdalena C.; Hüsson, Christoph; Sablatnig, Robert; Brameshuber, Mario; Baumgart, Florian; Schütz, Gerhard J.

doi:10.1038/s41598-020-60976-6

Download PDF

Article
Open access
Published: 06 March 2020

Verifying molecular clusters by 2-color localization microscopy and significance testing

Andreas M. Arnold¹^na1,
Magdalena C. Schneider¹^na1,
Christoph Hüsson²,
Robert Sablatnig²,
Mario Brameshuber¹,
Florian Baumgart¹ &
…
Gerhard J. Schütz¹

Scientific Reports volume 10, Article number: 4230 (2020) Cite this article

1902 Accesses
10 Citations
5 Altmetric
Metrics details

Subjects

Abstract

While single-molecule localization microscopy (SMLM) offers the invaluable prospect to visualize cellular structures below the diffraction limit of light microscopy, its potential has not yet been fully capitalized due to its inherent susceptibility to blinking artifacts. Particularly, overcounting of single molecule localizations has impeded a reliable and sensitive detection of biomolecular nanoclusters. Here we introduce a 2-Color Localization microscopy And Significance Testing Approach (2-CLASTA), providing a parameter-free statistical framework for the qualitative analysis of two-dimensional SMLM data via significance testing methods. 2-CLASTA yields p-values for the null hypothesis of random biomolecular distributions, independent of the blinking behavior of the chosen fluorescent labels. The method is parameter-free and does not require any additional measurements nor grouping of localizations. We validated the method both by computer simulations as well as experimentally, using protein concatemers as a mimicry of biomolecular clustering. As the new approach is not affected by overcounting artifacts, it is able to detect biomolecular clustering of various shapes at high sensitivity down to a level of dimers.

A tessellation-based colocalization analysis approach for single-molecule localization microscopy

Article Open access 30 May 2019

A framework for evaluating the performance of SMLM cluster analysis algorithms

Article 10 February 2023

Unbiased choice of global clustering parameters for single-molecule localization microscopy

Article Open access 29 December 2022

Introduction

Single Molecule Localization Microscopy (SMLM) has boosted our insights into cellular structures below the diffraction limit of light microscopy¹. Common to all SMLM variants is the stochastic switching of single dye molecules between a bright and a dark state. Conditions are chosen such that only a marginal portion of the molecules is in the bright state, so that single molecule signals are well separated on each frame. The final superresolution image is reconstructed from the localizations of all single molecule signals.

Researchers have been particularly intrigued by the possibility to determine the spatial distribution of biomolecules in their natural environment, in most cases the intact cell. For example, models for cellular signaling are crucially affected by the spatial organization of receptor and downstream signaling molecules at the plasma membrane^2,3. Application of SMLM to various plasma membrane proteins revealed the presence of nanoclusters to different degrees⁴. More recently concerns were raised that the stochastic activation process of the fluorophores, along with the presence of more than one dye molecule per labeled biomolecule, may lead to multiple observations of the same biomolecule in the superresolution image^5,6. Different attempts were undertaken to approach this problem^{5,7,8,9,10,11}, e.g. by merging localization bursts into one localization¹², by analyzing the number of blinking events per localization cluster^10,11, or by evaluating the spatial spread of the localization clusters⁷. A disadvantage of existing methods is the requirement of user-defined parameters^7,12 or additional experiments to characterize the blinking statistics of the chosen fluorophores^10,11. We recently developed a parameter-free method to identify global protein clustering based on a label titration approach⁸ (see also Spahn et al.⁹), however, in case of faint bimolecular clustering the discrimination is difficult and rather subjective. Taken together, it would be helpful to provide a parameter-free quantitative assessment for the reliability of the statement, whether biomolecular nanoclusters occur in an image or not.

Here we present a method to assess biomolecular nanoclustering in two-dimensional SMLM via p-values in the framework of statistical significance tests, termed 2-Color Localization microscopy And Significance Testing Approach (2-CLASTA). The idea is to target the same biomolecule of interest with different fluorescent labels, determine the localizations in the respective color channels and calculate the nearest neighbor distances between them. The test compares the nearest neighbor distances for the recorded data with the distances for a randomized data set calculated from the measured data. As an output, the method provides a p-value for the null hypothesis that the experimental data set corresponds to an underlying biomolecular distribution, which is not significantly different from a completely random distribution as described by a spatial Poisson process. In this respect, 2-CLASTA differs from existing quantitative approaches, which typically aim at determining quantitative parameters before actually testing the mere presence of biomolecular clusters. The method is parameter-free and does not require any additional measurements nor grouping of localizations. We validated the method experimentally in cells expressing artificially clustered proteins by showing that sizes down to 2 molecules per cluster can be reliably detected.

Results

Testing the null hypothesis of a random biomolecular distribution

Labeling the biomolecule of interest in two different colors yields different two-color SMLM images for a random versus a clustered biomolecular distribution (Fig. 1a). Both images show clear clustering of localizations in each of the color channels due to multiple observations of single dye molecules. The localization clusters of different color, however, correlate only in case of an underlying clustered distribution of biomolecules. As a quantitative measure of this correlation we used the empirical cumulative distribution function \(cdf\) of the nearest neighbor distance \(r\) between the localizations of the two different color channels. Importantly though, \(cdf(r)\) not only depends on the spatial distribution of the labeled biomolecule. Particularly, the blinking statistics of the fluorophore and the number of dye molecules conjugated to the biomolecule of interest affect the distribution functions. Since these parameters are commonly unknown, the different contributions to \(cdf(r)\) are difficult to disentangle.

To analyze the data, we hence opted for a strategy which is independent of prior information on label properties. The idea is to determine a randomized distribution function \(cd{f}_{rand}(r)\) for a scenario in which correlations between the two color channels are broken, by directly using the experimental data contained in the original SMLM recording. Our approach is similar to a goodness-of-fit test, in which the experimental data are compared with Monte Carlo-simulated control data sets using a global deviation measure for calculation of a p-value¹³.

In order to construct a randomized two color data set we transformed the localizations of one color channel and calculated their nearest neighbor distances to the untransformed localizations of the other color channel. As transformation we used a so-called toroidal shift¹⁴, which denotes a shift by an arbitrary distance in an arbitrary direction, assuming periodic boundary conditions (Fig. 1b). This breaks potential correlations between the two color channels while conserving the univariate pattern characteristics. The resulting \(cd{f}_{rand}(r)\) implicitly accounts for the correct blinking statistics and degree of labeling, and can hence be taken as ground truth for the situation of two uncorrelated images, irrespective of their univariate clustering that may be present in each color channel itself. Ideally, for a completely random protein distribution the cumulative density functions are equal (\(cdf(r)=cd{f}_{rand}(r)\)), whereas for a non-random distribution they are not (\(cdf(r)\ne cd{f}_{rand}(r)\)). Note that \(cd{f}_{rand}(r)\) does not need to correspond to a truly random distribution of molecules.

For the statistical assessment, we compared the original empirical \(cdf(r)\) with a set of N = 99 realizations of \(cd{f}_{rand,i}\) (\(i=1,\ldots ,N\)) for random choices of the toroidal shift vector \(\overrightarrow{v}\) (Fig. 1c) and ranked the summary statistics \({g}_{data}\) of the original curve (green) with respect to the control curves (gray) (see Methods). Since we are interested in nanoclustering of biomolecules, we determined the one-sided p-value by ranking the original \(cdf(r)\) with respect to all calculated \(cd{f}_{rand,i}\); the rank is measured in descending order (note that the method also allows for assessing biomolecular repulsion by calculating the rank in ascending order). In practice, prior knowledge on cluster sizes can be taken into account e.g. by constraining the analysis to short distances. For this, we introduced a parameter \({r}_{max}\), which should be chosen close to the sum of the localization errors and the expected cluster size. Here we ignored prior knowledge and set \({r}_{max}\) to the maximum occurring distance, if not mentioned otherwise.

Naturally, the p-value as defined here is limited to discrete numbers with steps of \(\frac{1}{N+1}\), which also defines the minimum p-value obtainable with this method. As expected, the p-value is uniformly distributed in the interval \([0,1]\), when testing realizations of the null hypothesis against the null hypothesis itself (Fig. S2). Hence, this p-value allows for the correct interpretation of the significance level α as the probability of falsely rejecting the null hypothesis. α can also be interpreted as the inevitable false positive rate for the erroneous detection of overcounting-induced clustering for a random distribution of biomolecules. Taken together, by offering an appropriate significance test, 2-CLASTA is hardly susceptible to the inadvertent interpretation of localization clusters as biomolecular nanoclusters.

Sufficient sensitivity to detect even faint spatial biomolecular clustering is crucial to this test. We assessed the sensitivity (also frequently termed power) of 2-CLASTA for two clustering scenarios: i) biomolecular oligomerization (dimers, trimers, and tetramers), and ii) spatially extended clusters with varying load. The spatial distribution of the biomolecules and the according localization maps were generated with Monte Carlo simulations and evaluated with 2-CLASTA. We quantified the test performance via the sensitivity defined as \(sensitivity=\frac{tp}{tp+fn}\), with \(tp\) denoting the true positives (here defined as correctly detected clustering) and \(fn\) the false negatives (here defined as erroneously missed clustering). We used a significance level α = 0.05 in the following.

Sensitivity to detect biomolecular oligomerization

We first assessed the sensitivity of 2-CLASTA to detected different degrees of oligomerization. For this, we simulated 10 × 10 µm² sized localization maps containing randomly distributed dimers, trimers, or tetramers, assigned labels of the two colors with the according blinking statistics and added localization errors. Each localization map can be considered as a realization of a two-color superresolution experiment. The localization maps were analyzed by 2-CLASTA, yielding a p-value for each localization map and the sensitivity for each parameter set. We showcased the performance of the method with a simulated “ideal” scenario, which lacks the presence of unspecific signals. We further assumed a global degree of labeling of 100%, i.e. each biomolecule was represented either by a blue or a red label, yielding localizations according to the experimentally derived blinking statistics for SNAP₄₈₈ and SNAP₆₄₇, respectively (Fig. S1). If not specified otherwise, we simulated a balanced labeling ratio for the two colors. Unspecifically bound fluorophores and background signals may be present in the final localization maps, which may affect the obtained statistics. We hence also analyzed a more “realistic” scenario, for which we added 5 unspecifically bound labels per µm² in each color channel, and 1 or 2 unspecific background signals per µm² in the red or blue color channel, respectively; the characteristics of the unspecific background signals were experimentally determined on unstained cells. For the “realistic” case, we further assumed a reduced global degree of labeling of 40%. All simulation parameters are listed in the Materials and Methods section under the subheading Simulations.

Firstly, we were interested in the total number of biomolecules per image that are required for a reliable detection of oligomerization. Already low numbers of biomolecules of ~1,000 per image (corresponding to 10 molecules per µm²) allow for a sensitive detection even of dimerization, both for the “ideal” and the “realistic” scenario (Fig. 2a). As expected, the sensitivity is somewhat reduced with decreasing degree of oligomerization: this is a consequence of the reduced fraction of oligomers carrying two different labels, particularly for the “realistic” scenario. For the following simulations, we used 7,500 molecules per image (75 molecules/µm²). We next tested the influence of a reduced labeling efficiency. In general, sensitivity was found to be high even down to a labeling degree of ~20% (Fig. 2b). To test the influence of different blinking statistics or of multiple dye molecules per label, we compared the simulation results obtained with the blinking statistics of the pair SNAP₄₈₈ and SNAP₆₄₇ with results obtained using the blinking behavior of the pair PS-CFP2 (blue channel) and an Alexa Fluor 647-conjugated antibody (KT3₆₄₇, red channel)¹⁵, yielding virtually identical curves (Fig. S3). We also tested artificial blinking statistics assuming log-normal distributed numbers of localizations for each label¹⁶. Even for the rather extreme case of ten-fold difference in the mean number of localizations per biomolecule we observed only minor effects on the sensitivity of our method.

Importantly, all blinking statistics were obtained under low labelling conditions in cells and hence adequately reflect the variability in the local environments of the dye molecules.

In a real-life experiment, differently colored ligands may have different affinities for the target biomolecule, leading to unbalanced labelling. While this can be compensated experimentally by adjusting the concentrations of the two labels, we found the sensitivity to remain high also for unbalanced labeling (Fig. 2c). Next, we also tested the influence of randomly distributed unspecific labels added onto the simulated oligomer distributions, yielding only marginal influences (Fig. S4a). Further, the magnitude of the localization errors hardly affects the obtained results (Fig. S4b). Interestingly, an assessment of the influence of stage drift showed that drift-velocities of up to 500 nm over 10.000 frames hardly affected the test sensitivity (Fig. 2d). This is not unexpected, as moderate drift hardly diminishes the correlations between the two color channels in an experiment performed at alternating laser excitation. Moderate chromatic aberrations have a negligible effect on the sensitivity of 2-CLASTA (Fig. S4c,d). We also evaluated the influence of different values of \({r}_{max}\) on the sensitivity of the method, yielding only minor effects (Fig. S5). Finally, we compared different test statistics in their sensitivity to detect dimers. Using k-nearest neighbor statistics (k = 3,5,10) or L_cross statistics¹⁷ did not improve sensitivity (Fig. S6); note that L_cross statistics is not parameter-free.

Sensitivity to detect areas of enrichment or depletion of biomolecules

As a second realization of a non-random spatial distribution of biomolecules we considered spatially extended circular domains, the centers of which were randomly distributed across a two-dimensional plane. Molecules were placed either inside or outside of the domains, which thereby represent areas enriched or depleted in biomolecules compared to the surface density outside of the domains. To facilitate comparison with our previously published approach^8,15, we used here the same parameter settings for assessing the performance of 2-CLASTA: we varied the domain radius between 20 nm and 150 nm, the domain density between 3 and 20 domains per µm², and the fraction of molecules in domains between 20% and 100% (Fig. S7). The average density of biomolecules across the whole field of view was kept constant at 75 molecules per µm².

In general, virtually all scenarios with a substantial heterogeneity in the lateral distribution of the biomolecule can be detected by 2-CLASTA (Figs. 3a and S8): both biomolecular clustering (top right corner) and exclusion areas (bottom left corner) yield a high level of correctly identified scenarios. In particular, the new method even outperforms our previous approach based on label titration, as can be seen by comparing the new figures with the respective plots from our previous paper (Supplementary Figs. 5 and 6)¹⁵.

The diagonal in Fig. 3a represents scenarios, in which the biomolecular concentration inside the domains is similar to the concentration outside of the domains. In other words, these situations correspond to random distributions of biomolecules, which – if detected – would lead to false positive results. Per definition, a random distribution leads to a false positive rate that is identical to the chosen level of significance (here α = 0.05). Indeed, for scenarios corresponding to identical biomolecular densities (±10%) inside versus outside the domains we obtained sensitivity values close to α, hence reaching the principal limit for analyzing a statistical data set.

Again, we simulated a more “realistic” scenario as defined above, yielding similar results as for the ideal scenarios (Figs. 3a and S9). Also in the case of extended areas of enrichment or depletion the localization error had only marginal influence on the sensitivity (Fig. S10). To assess whether the use of different fluorescent labels with altered photophysical properties affect the results, we repeated the simulations both for the “ideal” and the „realistic“ case using the blinking statistics derived previously for a multi-labelled antibody and the photoactivatable protein PS-CFP2¹⁵, yielding virtually unchanged results (Fig. S11). Finally, we tested the algorithm on rectangular clusters of 80 × 400 nm² size (Fig. 3b), yielding similar sensitivity as for circular domains of the same area coverage. In conclusion, the new approach allows for reliable detection of even faint biomolecular clustering, and is not susceptible to false positives due to overcounting artifacts.

Experimental validation

For experimental validation of the 2-CLASTA approach, we mimicked protein monomers and oligomers by concatemers of SNAP-tags with 1 to 4 subunits. These concatemers were anchored in the plasma membrane of HeLa cells via a glycosyl-phosphatidylinositol- (GPI-) anchor. For example, SNAP-concatemers of 4 SNAP-tag subunits would correspond to tetrameric protein oligomers. Clustering of the GPI anchor per se is not expected^7,8. All experiments were performed at similar labeling densities of SNAP-Surface Alexa Fluor 488 (SNAP₄₈₈) and SNAP-Surface Alexa Fluor 647 (SNAP₆₄₇). dSTORM experiments were performed at alternating excitation, yielding superresolution images of the two color channels (Fig. 4a). For each concatemer, we recorded ≥25 cells, and determined the according p-value for the null-hypothesis of a random protein distribution, as described above (Fig. 4b). For SNAP-monomers, we observed a uniform distribution of p-values in the interval \([0,1]\), hence providing no indication for a non-random distribution. In contrast, dimeric, trimeric, and tetrameric SNAP-constructs yielded clear deviations from a uniform distribution, with a substantial peak at low p-values. This reflects the expected signature for an underlying non-random distribution of SNAP-tags.

There is, however, a non-negligible fraction of cells which show p-values > 0.05, even in the case of oligomeric SNAP constructs. This effect is rather prominent for dimers and decreases with increasing degree of oligomerization. In a practical situation, however, one should note that different cells show different protein expression levels, thereby yielding a variability in the number of molecules within the region of interest. As shown in Fig. 2a, a low number of molecules would reduce the sensitivity for the detection of oligomers, or – in other words – would likely yield a high p-value. Indeed, when plotting the obtained p-value versus the number of localizations obtained per cell, we found a trend for high p-values at low localization numbers, which became more pronounced with increasing degree of oligomerization (Fig. S12). Particularly, for appr. 1,000 molecules per image – corresponding to appr. 5,000 localizations, we expect reduced sensitivity, which agrees with Fig. S12.

Discussion

We present here a parameter-free method to statistically assess the question whether biomolecules are distributed randomly in two dimensions, yielding a p-value as output parameter. As for all SMLM methods, live cell applications are inherently restricted by dynamic cellular processes within the rather long recording times. The method is compatible with most fluorescence labeling techniques, as long as it is ensured that each protein molecule is connected to one color channel only: this includes fluorescent antibodies or nanobodies, tags, or low affinity binders¹⁸.

Up to now, statistical detection of biomolecular nanoclustering demanded for obtaining the localization maps of a truly random protein distribution as a reliable standard for comparison with the experimental data. If such a distribution was available, comparative analysis such as Rényi divergence¹⁹ would be feasible. It turns out, however, that localization maps - as they would result from a truly random biomolecular distribution - are difficult to obtain, particularly since the photophysics of organic dyes often changes with the local environment of the chromophore²⁰. We circumvented the problem by analyzing not the images themselves, but a correlation metric between the localizations of the two color channels (in our case, the nearest neighbor distances). In principle, also other metrics could be used for the significance test (e.g. pair cross-correlation analysis^7,21 or Ripley’s covariate analysis¹⁷), especially for testing deviations on length scales beyond the nearest neighbors.

The chosen analysis strategy based on correlation metrics offers the advantage that potential correlations between the two color channels can be deliberately broken, here by applying a toroidal shift to one of the two color channels. By this, the univariate spatial structure of each localization pattern is conserved, while possible correlations are removed. This provides the possibility of significance testing between the original data and the randomized control data sets as an additional advantage.

To make the method immediately applicable, we provide a plugin for ImageJ (see Supporting Material). The experimental basis is a chromatically corrected two-color SMLM data-set analyzed by standard single molecule localization tools²².

In the following, we give a brief discussion on the strengths and potential pitfalls of our approach:

Strengths

(i)
2-CLASTA is stable against errors in two-color image registration or drift correction. As long as these errors are smaller than typical cross-correlation distances of the two color-channels, the effects on the obtained p-values are marginal.
(ii)
The sensitivity of 2-CLASTA is virtually unaffected by increased localization errors. Such errors do not change the physical separation of the biomolecules, but only distribute the single molecule localizations over larger areas, while preserving the correlation of the two color channels. As the method is essentially sensitive to the biomolecular separation, localization errors have no strong effect on the sensitivity.
(iii)
2-CLASTA is not impaired by blinking dye molecules, and does not require the recording of single molecule blinking statistics (as e.g. in the methods published in refs. ^5,10,11), making it insensitive to overcounting problems. In addition, 2-CLASTA can directly be applied to single images, thereby simplifying experimental efforts compared to our previously published method of label density variation⁸.
(iv)
The sensitivity of 2-CLASTA is not affected by any unknown characteristics of the clusters. No assumptions on cluster parameters (size, shape, occupancy) are required for the test. The test performs well even down to the detection of dimers, reflecting the smallest possible clusters.
(v)
2-CLASTA is stable against real life experimental challenges: A typical experiment contains non-specific localizations, or false negatives as a consequence of insufficient degree of labeling. Also the labeling ratio of the two colors may be unbalanced. We extensively tested the influence of such issues in Monte Carlo simulations, and found that the test is very robust over a wide range of parameters.

Potential pitfalls

The sample topography may influence the obtained results: Without further information, it is reasonable to assume a completely random distribution of biomolecules on a flat two-dimensional surface parallel to the focal plane as the null hypothesis of the test. Randomly distributed biomolecules on an arbitrary two-dimensional manifold, however, may lead to virtual clustering in the projection onto a two-dimensional plane. For example, invaginations of the plasma membrane or height differences near the cell borders will cause the accumulation of the detected positions of membrane proteins in the 2D projection²³ and hence will likely lead to a rejection of the null hypothesis. Also, care should be taken when selecting the region of interest: ideally, a central region of the cell should be chosen for analysis, avoiding cell edges and apparent vesicular structures. In principle, whenever reasonable one may further restrict the region of interest in order to specifically scrutinize subcellular structures (e.g. synapses) for the presence of local biomolecular clustering. This comes, however, at the cost of lower molecule numbers, hence reducing the sensitivity.

Conclusion

Taken together, we believe that the 2-CLASTA approach is well suited for a qualitative assessment of spatial two-dimensional biomolecular distributions, before more sophisticated methods are used to characterize the clustering quantitatively⁶. It thereby addresses an interest by the community to include hypothesis testing in SMLM analysis²⁴. By providing p-values, it makes use of the appropriate statistical parameter to test whether a specific data set is in agreement with a particular hypothesis²⁵. Here, small p-values indicate suspicious deviations from randomness. Large p-values, in contrast, do not indicate any peculiarities in the sample; most notably, they do not prove a spatially random distribution of biomolecules. One should note that care has to be taken when interpreting the results of significance tests^25,26. As a particular example, fishing for data sets with small p-values should be avoided.

A further application of 2-CLASTA is the analysis of co-localization of two different types of biomolecules: in this case, the two colors would be used to target the two different biomolecules. In this paper, we provide the framework to test for biomolecular association: extension towards assessment of biomolecular repulsion is straightforward and described in the Methods section.

Materials and Methods

Cell culture, DNA constructs, and reagents

All chemicals and cell culture supplies were from Sigma if not noted otherwise. All reagents for molecular cloning were from New England Biolabs. HeLa cells were purchased from DSMZ (ACC 57 Lot 23) and cultured in DMEM high glucose medium (D6439) supplemented with 10% fetal bovine serum (F7524) and 1 kU/ml Penicillin-Streptomycin (P4333). All cells were grown in a humidified atmosphere at 37 °C and 5% CO₂.

For transient transfection of HeLa cells with GPI-anchored SNAP concatemers, we fused one or multiple copies of the SNAP_f sequence to the N-terminus of the GPI-anchor signal of the human folate receptor. To this end, we carried out PCR to amplify the SNAP_N9183S sequence from pSNAP_f (N9183S) with >15 nt overhangs complementary to adjacent regions of the following SNAP_f copy. We then used the Gibson assembly Master Mix (E2611) following the supplier’s instructions to iteratively insert multiple consecutive copies of the SNAP_f sequence in frame with the GPI anchor. The resulting colonies were screened by site specific restriction digest using HindIII (R3104) to verify the number of inserted copies.

SNAP-Surface Alexa Fluor 488 (SNAP₄₈₈) and SNAP-Surface Alexa Fluor 647 (SNAP₆₄₇) were from New England BioLabs. Both labels were reconstituted in water-free DMSO (276855) at 10 mg/ml, aliquoted and stored at −20 °C until used.

STORM blinking buffer consisted of PBS, 50 mM β-Mercaptoethylamine (30070), 3% (v/v) OxyFluor (Oxyrase Inc., Mansfield, Ohio, U.S.A.), and 20% (v/v) sodium DL-lactate (L1375)²⁷. The pH was adjusted to 8–8.5 using 1 M NaOH.

Sample preparation

Cells were transfected by reverse transfection using Turbofect (ThermoFisher, R0531) according to the supplier’s instructions with Opti-MEM (Gibco, 31985062) as serum-free growth medium. Briefly, cells were detached from tissue culture flasks using Accutase (A6964). Subsequently, approximately 50,000 cells were mixed with Turbofect-DNA complexes and seeded on fibronectin-coated (F1141) LabTek chambers (Nunc) and incubated overnight. The following day, cells were labeled for 30–45 min in the incubator with 50 nM SNAP₄₈₈ and 1 µM SNAP₆₄₇ diluted in cell culture medium. After labeling, cells were extensively washed with HBSS, and fixed with 4% formaldehyde (Thermo Scientific, R28908) and 0.2% glutaraldehyde (GA) for 30 min at room temperature. After another series of two washing steps, we added 450 µl freshly prepared STORM buffer immediately prior to imaging.

Superresolution microscopy and image reconstruction

A Zeiss Axiovert 200 microscope equipped with a 100x Plan-Apochromat (NA = 1.46) objective (Zeiss) was used for imaging samples in objective-based total internal reflection (TIR) configuration. TIR illumination was achieved by shifting the excitation beam parallel to the optical axis with a mirror mounted on a motorized table. The setup was further equipped with a 640 nm diode laser (Obis640, Coherent), a 405 nm diode laser (iBeam smart 405, Toptica) and a 488 nm diode laser (iBeam smart 488, Toptica). Laser lines were overlaid with an OBIS Galaxy beam combiner (Coherent). Laser intensity and timings were modulated using in-house developed LabVIEW software (National Instruments). To separate emission from excitation light, we used a dichroic mirror (Z488 647 RPC, Chroma). Images were split chromatically into two emission channels using an Optosplit2 (Cairn Research) with a dichroic mirror (DD640-FDi01-25 × 36, Semrock) and additional emission filters for each color channel (690/70 H and FF01-550/88-25, Chroma). All data was recorded on a back-illuminated EM-CCD camera (Andor iXon DU897-DV).

Typically, we recorded sequences of 20 000 frames in alternating excitation mode. Samples were illuminated repeatedly at 640 nm, 405 nm, and 488 nm with 2–3 kW/cm² intensity (640 nm and 488 nm) and 3–5 W/cm² (405 nm); intensities were measured in epi-configuration. We selected the illumination times in ranges of 3 ms–10 ms (640 nm), 3 ms–30 ms (488 nm), and 6 ms (405 nm). Time delays between consecutive illuminations were below 6 ms. The camera was read out after the 640 nm and after the 488 nm illumination, yielding 10 000 frames in each color channel. Thus, the total recording time for a full dataset ranged from 3 to 7 minutes. Only data from those frames were included in the analysis, in which well-separated single molecule signals were observable.

We recorded calibration images of immobilized fluorescent beads after each experiment (TetraSpeck Fluorescent Microspheres, life technologies, T14792) and registered the images as described previously²⁸. Single molecule localization and image reconstruction was performed using the open-source ImageJ plugin ThunderSTORM²⁹.

Quantitative analysis of single label blinking

We quantified single label blinking on HeLa cells expressing GPI-anchored SNAP-monomers, using the identical illumination protocol as for 2-color dSTORM recordings. To assure sufficient separation between individual label molecules, dSTORM experiments were performed at low labeling concentrations of either SNAP₆₄₇ or SNAP₄₈₈. To statistically quantify the blinking of SNAP₆₄₇ and SNAP₄₈₈, localizations from individual label molecules were grouped and quantified in MATLAB (R2019b, The MathWorks Inc., Natick, MA). We determined the first frame of appearance, the total number of detections per label (N) (Fig. S1), the time a label is detectable in consecutive frames (t_on) and the time a label is not detectable (t_off).

Calculation of p-values

We compared the positions of all localizations obtained in the red color channel \(({x}_{red},{y}_{red})\) with those obtained in the blue color channel \(({x}_{blue},{y}_{blue})\). For this, we calculated the distribution of distances, \({\rm{r}}\), from each red localization to the nearest blue localization, and determined the cumulative distribution function \(cdf(r)\). To determine the distribution of nearest neighbor distances under the null model we applied a so-called toroidal shift¹⁴ to the positions of the red color channel \(({\tilde{x}}_{red},{\tilde{y}}_{red})=({x}_{red},{y}_{red})+\overrightarrow{v}\), where \(\overrightarrow{v}=({x}_{shift},{y}_{shift})\) is the shift vector, with periodic boundary conditions set by the region of interest. Repeating this procedure \({\rm{N}}\) times with random shift vectors \(\overrightarrow{v}\,\)chosen uniformly within the region of interest, yielded \(N\) realizations of the null model of a random distribution of biomolecules. The according nearest neighbor distributions for each randomized control \(i\) was determined as described above, yielding \(cd{f}_{rand,i}(r)\). We defined the test statistic as the integral \(g={\int }_{0}^{{r}_{max}}cdf(r)dr\), since it captures the whole spectrum of distances, without bias. If not mentioned otherwise we set \({r}_{max}\) to the maximum nearest neighbor distance occurring during the whole analysis, making the method independent from any user-defined parameter (Fig. S5). To compare the distributions, we computed the value of the test statistic for the original data \({g}_{data}={\int }_{0}^{{r}_{max}}cd{f}_{data}(r)dr\). Calculating the test statistic for all randomized controls yielded the set \({G}_{rand}=\{{g}_{rand,i}={\int }_{0}^{{r}_{max}}cd{f}_{rand,i}(r)dr|i=1,\ldots ,N\}.\) We next determined \(rank({g}_{data},G),\) which is defined as the rank of \({g}_{data}\) within the set union \(G={G}_{rand}{\cup }^{}\{{g}_{data}\}\), where the statistical rank is measured in descending order. Naturally, \(p=\frac{rank({g}_{data},G)}{N+1}\) yields the one-sided p-value³⁰, since for randomly distributed molecules \(\frac{rank({g}_{data},G)}{N+1}\) is uniformly distributed on the interval \([\frac{1}{N+1},1]\) (Fig. S2). The calculated p-value is limited to discrete numbers with steps of \(\frac{1}{N+1}\). Real molecular clustering results in a bias towards lower nearest neighbor distances, yielding low p-values as shown in Fig. 1c. In principle, the method can also be used to test for biomolecular repulsion; in this case, \({g}_{data}\) needs to be ranked within \(g\) in ascending order for calculation of the p-value.

To calculate a p-value from k-nearest neighbor statistics, cumulative distribution functions \(cdf(r)\) for the mean distance of the k-nearest neighbors (k = 3, 5 or 10) were processed and analyzed as described above. When using L_cross as basis of the test statistic, cumulative distribution functions \(cdf(L)\) with \(L={L}_{cross}({r}^{\ast })\) were generated for a radius of \({r}^{\ast }=50\,nm.\) The test statistic \(g\) was then calculated as \(g={\int }_{0}^{{L}_{max}}cdf(L)dL\). Since clustering results in high values of L_cross the \(rank({g}_{data},G)\) was measured in ascending order.

Simulations

Conceptually, simulations were performed as described previously¹⁵. All simulations were carried out in MATLAB (R2019b, The MathWorks Inc., Natick, MA) on a standard personal computer.

First, we simulated the underlying protein distributions for regions of 10 × 10 µm², reflecting approximately the size of a typical cell. For all simulations we used 75 molecules per µm², if not mentioned otherwise.

Simulation of oligomers

We distributed oligomers randomly within the region of interest, and assigned n biomolecules to each n-mer position (n = 1 to 4). A random distribution of biomolecules is naturally reflected by the case of n = 1.

Simulation of areas of enrichment or depletion of biomolecules

Circular domains with a radius of 20, 40, 60, 80, 100 or 150 nm were distributed randomly onto the region of interest with adjustable number of domains per µm² (3, 5, 10, 15, 20 and 25). The number of biomolecules per domain was calculated from the total number of simulated molecules (here 7,500), the fraction of molecules inside domains (20, 40, 60, 80, 100%), and the number of simulated domains, assuming a Poissonian distribution. Biomolecules were distributed randomly within the domains. The remaining molecules were distributed randomly in the areas outside of the domains.

Second, two different types of labels, corresponding to the two colors, were assigned randomly to the molecules according to the specified labeling ratio, assuming Binomial statistics.

Third, to simulate blinking, we assigned a number of detections to each label (Fig. S1). For the simulations shown in Fig. S3 and Fig. S11, we used blinking statistics determined previously¹⁵ or artificial blinking statistics following a log-normal distribution. Localization errors were simulated by spreading these detections using a Gaussian profile centered on the molecule position with a width of 30 nm, which corresponds to typical localization errors achieved in SMLM experiments. We assumed identical localization errors for the two color channels.

Fourth, to account for experimental errors in the “realistic” scenarios, we included unspecifically bound labels at a mean density of 5 labels/µm² for each color channel, assuming the blinking statistics determined for SNAP₄₈₈ and SNAP₆₄₇. We finally considered also false positive localizations by adding a background of 1 (2) signals/µm² for the red (blue) color channel, again with experimentally determined blinking statistics obtained in unlabeled cells.

Fifth, to account for stage drift in Fig. 2d we assumed alternating laser excitation and hence added a global drift vector \(\overrightarrow{d}\) to the localizations of both color channels obtained at time \(t\) according to \(\overrightarrow{x}\to \overrightarrow{x}+\overrightarrow{d}\cdot t\).

Sixth, to account for residual chromatic aberrations in Fig. S4c,d, we displaced every localization of the red color channel by a vector, which was characterized by the vector field \((x\text{'},y\text{'})=\beta \cdot (x-{x}_{0},y-{y}_{0})\); we set \({x}_{0}={y}_{0}=2.5\,{\mu }m\). \(\beta \) was varied between 0 and 0.06.

If not mentioned otherwise, 100 simulations were performed for each experimental condition.

If not mentioned otherwise, we used the following set of parameters: 10 × 10 µm² region of interest, 75 molecules per µm², a balanced labeling ratio between the two color channels, no stage drift, 30 nm localization error (standard deviation), and the blinking statistics determined for SNAP₄₈₈ and SNAP₆₄₇ for the two color channels. For the “ideal” scenario we simulated 100% labeling efficiency, no unspecifically bound labels and no unspecific background signals. For the “realistic” scenario we simulated 40% labeling efficiency, 5 unspecifically bound labels per µm² and color channel, and 1 or 2 unspecific background signals per µm² in the red and blue color channel, respectively.

Data availability

The data that support the findings of this study are available from the corresponding authors upon reasonable request.

References

Schermelleh, L. et al. Super-resolution microscopy demystified. Nat. Cell Biol. 21, 72–84, https://doi.org/10.1038/s41556-018-0251-8 (2019).
Article CAS PubMed Google Scholar
Hartman, N. C. & Groves, J. T. Signaling clusters in the cell membrane. Curr. Opin. Cell Biol. 23, 370–376, https://doi.org/10.1016/j.ceb.2011.05.003 (2011).
Article CAS PubMed PubMed Central Google Scholar
Harding, A. S. & Hancock, J. F. Using plasma membrane nanoclusters to build better signaling circuits. Trends Cell Biol. 18, 364–371, https://doi.org/10.1016/j.tcb.2008.05.006 (2008).
Article CAS PubMed PubMed Central Google Scholar
Garcia-Parajo, M. F., Cambi, A., Torreno-Pina, J. A., Thompson, N. & Jacobson, K. Nanoclustering as a dominant feature of plasma membrane organization. J. Cell Sci. 127, 4995–5005, https://doi.org/10.1242/jcs.146340 (2014).
Article CAS PubMed PubMed Central Google Scholar
Annibale, P., Vanni, S., Scarselli, M., Rothlisberger, U. & Radenovic, A. Identification of clustering artifacts in photoactivated localization microscopy. Nat. Methods 8, 527–528, https://doi.org/10.1038/nmeth.1627 (2011).
Article CAS PubMed Google Scholar
Baumgart, F., Arnold, A., Rossboth, B., Brameshuber, M. & Schütz, G. J. What we talk about when we talk about nanoclusters. Methods Appl. Fluorescence 7, 013001, https://doi.org/10.1088/2050-6120/aaed0f (2019).
Article ADS CAS Google Scholar
Sengupta, P. et al. Probing protein heterogeneity in the plasma membrane using PALM and pair correlation analysis. Nat. Methods 8, 969–975, https://doi.org/10.1038/nmeth.1704 (2011).
Article CAS PubMed PubMed Central Google Scholar
Baumgart, F. et al. Varying label density allows artifact-free analysis of membrane-protein nanoclusters. Nat. Meth 13, 661–664, https://doi.org/10.1038/nmeth.3897 (2016).
Article CAS Google Scholar
Spahn, C., Herrmannsdorfer, F., Kuner, T. & Heilemann, M. Temporal accumulation analysis provides simplified artifact-free analysis of membrane-protein nanoclusters. Nat. Meth 13, 963–964, https://doi.org/10.1038/nmeth.4065 (2016).
Article CAS Google Scholar
Hummer, G., Fricke, F. & Heilemann, M. Model-independent counting of molecules in single-molecule localization microscopy. Mol. Biol. Cell 27, 3637–3644, https://doi.org/10.1091/mbc.E16-07-0525 (2016).
Article CAS PubMed PubMed Central Google Scholar
Ehmann, N. et al. Quantitative super-resolution imaging of Bruchpilot distinguishes active zone states. Nat. Commun. 5, 4650, https://doi.org/10.1038/ncomms5650 (2014).
Article ADS CAS PubMed PubMed Central Google Scholar
Annibale, P., Vanni, S., Scarselli, M., Rothlisberger, U. & Radenovic, A. Quantitative photo activated localization microscopy: unraveling the effects of photoblinking. PLoS ONE 6, e22678, https://doi.org/10.1371/journal.pone.0022678 (2011).
Article ADS CAS PubMed PubMed Central Google Scholar
Illian, J., Penttinen, A., Stoyan, H. & Stoyan, D. Statistical analysis and modelling of spatial point patterns. (John Wiley, 2008).
Lotwick, H. W. & Silverman, B. W. Methods for Analysing Spatial Processes of Several Types of Points. Journal of the Royal Statistical Society. Ser. B 44, 406–413 (1982).
Google Scholar
Rossboth, B. et al. TCRs are randomly distributed on the plasma membrane of resting antigen-experienced T cells. Nat. Immunology 19, 821–827, https://doi.org/10.1038/s41590-018-0162-7 (2018).
Article CAS Google Scholar
Zanacchi, F. C. et al. A DNA origami platform for quantifying protein copy number in super-resolution. Nat. Meth 14, 789–792, https://doi.org/10.1038/nmeth.4342 (2017).
Article CAS Google Scholar
Rossy, J., Cohen, E., Gaus, K. & Owen, D. M. Method for co-cluster analysis in multichannel single-molecule localisation data. Histochem. Cell Biol. 141, 605–612, https://doi.org/10.1007/s00418-014-1208-z (2014).
Article CAS PubMed Google Scholar
Linde, Svd, Heilemann, M. & Sauer, M. Live-Cell Super-Resolution Imaging with Synthetic Fluorophores. Annu. Rev. Phys. Chem. 63, 519–540, https://doi.org/10.1146/annurev-physchem-032811-112012 (2012).
Article ADS CAS PubMed Google Scholar
Staszowska, A. D. et al. The Rényi divergence enables accurate and precise cluster analysis for localization microscopy. Bioinforma. 34, 4102–4111, https://doi.org/10.1093/bioinformatics/bty403 (2018).
Article CAS Google Scholar
Levitus, M. & Ranjit, S. Cyanine dyes in biophysical research: the photophysics of polymethine fluorescent dyes in biomolecular environments. Q. Rev. Biophysics 44, 123–151, https://doi.org/10.1017/S0033583510000247 (2011).
Article CAS Google Scholar
Schnitzbauer, J. et al. Correlation analysis framework for localization-based superresolution microscopy. Proc. Natl Acad. Sci. 115, 3219–3224, https://doi.org/10.1073/pnas.1711314115 (2018).
Article ADS CAS PubMed Google Scholar
Sage, D. et al. Super-resolution fight club: assessment of 2D and 3D single-molecule localization microscopy software. Nat. Methods 16, 387–395, https://doi.org/10.1038/s41592-019-0364-4 (2019).
Article CAS PubMed PubMed Central Google Scholar
Adler, J., Shevchuk, A. I., Novak, P., Korchev, Y. E. & Parmryd, I. Plasma membrane topography and interpretation of single-particle tracks. Nat. Methods 7, 170–171, https://doi.org/10.1038/nmeth0310-170 (2010).
Article CAS PubMed Google Scholar
Baddeley, D. & Bewersdorf, J. Biological Insight from Super-Resolution Microscopy: What We Can Learn from Localization-Based Images. Annu. Rev. Biochem. 87, 965–989, https://doi.org/10.1146/annurev-biochem-060815-014801 (2018).
Article CAS Google Scholar
Altman, N. & Krzywinski, M. Points of Significance: Interpreting P values. Nat. Meth 14, 213–214, https://doi.org/10.1038/nmeth.4210 (2017).
Article CAS Google Scholar
Lakens, D. The Practical Alternative to the P-value Is the Correctly Used P-value. PsyArXiv, https://doi.org/10.31234/osf.io/shm8v (2019).
Nahidiazar, L., Agronskaia, A. V., Broertjes, J., van den Broek, B. & Jalink, K. Optimizing Imaging Conditions for Demanding Multi-Color Super Resolution Localization Microscopy. PLoS ONE 11, e0158884, https://doi.org/10.1371/journal.pone.0158884 (2016).
Article CAS PubMed PubMed Central Google Scholar
Ruprecht, V., Brameshuber, M. & Schütz, G. J. Two-color single molecule tracking combined with photobleaching for the detection of rare molecular interactions in fluid biomembranes. Soft Matter 6, 568–581, https://doi.org/10.1039/b916734j (2010).
Article ADS CAS Google Scholar
Ovesný, M., Křížek, P., Borkovec, J., Švindrych, Z. & Hagen, G. M. ThunderSTORM: a comprehensive ImageJ plugin for PALM and STORM data analysis and super-resolution imaging. Bioinforma. 30, 2389–2390, https://doi.org/10.1093/bioinformatics/btu202 (2014).
Article CAS Google Scholar
Wiegand, T. & Moloney, K. A. Handbook of spatial point-pattern analysis in ecology. (CRC Press, Taylor & Francis Group, 2014).

Download references

Acknowledgements

This work was supported by the Austrian Science Fund with project numbers F 6809-N36 and P 26337-B21 (to G.J.S.), P 27941-E28 (to F.B.), and a DOC Fellowship (24793) from the Austrian Academy of Sciences (A.M.A.). G.J.S. and R.S. acknowledge support by the PhD program “MEIBio (Molecular and Elemental Imaging in Biosciences)” provided by TU Wien.

Author information

These authors contributed equally: Andreas M. Arnold and Magdalena C. Schneider.

Authors and Affiliations

Institute of Applied Physics, TU Wien, Getreidemarkt 9, A-1060, Vienna, Austria
Andreas M. Arnold, Magdalena C. Schneider, Mario Brameshuber, Florian Baumgart & Gerhard J. Schütz
Institute of Visual Computing and Human-Centered Technology, TU Wien, Favoritenstrasse 9-11, A-1040, Vienna, Austria
Christoph Hüsson & Robert Sablatnig

Authors

Andreas M. Arnold
View author publications
You can also search for this author in PubMed Google Scholar
Magdalena C. Schneider
View author publications
You can also search for this author in PubMed Google Scholar
Christoph Hüsson
View author publications
You can also search for this author in PubMed Google Scholar
Robert Sablatnig
View author publications
You can also search for this author in PubMed Google Scholar
Mario Brameshuber
View author publications
You can also search for this author in PubMed Google Scholar
Florian Baumgart
View author publications
You can also search for this author in PubMed Google Scholar
Gerhard J. Schütz
View author publications
You can also search for this author in PubMed Google Scholar

Contributions

A.M.A. and M.C.S. contributed equally. F.B. and G.J.S. conceived the study; A.M.A., F.B., G.J.S. and M.C.S., designed the analytical method; M.C.S. developed the algorithm for p-value calculation; A.M.A. and M.C.S. wrote the code for the analytical methods and the simulations; A.M.A., F.B., M.B. and M.C.S. performed experiments; A.M.A., M.B., and M.C.S. analyzed the data; C.H. wrote the ImageJ plugin; G.J.S. and R.S. directed research; A.M.A., G.J.S. and M.C.S. wrote the manuscript.

Corresponding authors

Correspondence to Florian Baumgart or Gerhard J. Schütz.

Ethics declarations

Competing interests

The authors declare no competing interests.

Additional information

Publisher’s note Springer Nature remains neutral with regard to jurisdictional claims in published maps and institutional affiliations.

Supplementary information

Supplementary Information.

Supplementary Information2.

Rights and permissions

Open Access This article is licensed under a Creative Commons Attribution 4.0 International License, which permits use, sharing, adaptation, distribution and reproduction in any medium or format, as long as you give appropriate credit to the original author(s) and the source, provide a link to the Creative Commons license, and indicate if changes were made. The images or other third party material in this article are included in the article’s Creative Commons license, unless indicated otherwise in a credit line to the material. If material is not included in the article’s Creative Commons license and your intended use is not permitted by statutory regulation or exceeds the permitted use, you will need to obtain permission directly from the copyright holder. To view a copy of this license, visit http://creativecommons.org/licenses/by/4.0/.

Reprints and permissions

About this article

Cite this article

Arnold, A.M., Schneider, M.C., Hüsson, C. et al. Verifying molecular clusters by 2-color localization microscopy and significance testing. Sci Rep 10, 4230 (2020). https://doi.org/10.1038/s41598-020-60976-6

Download citation

Received: 18 November 2019
Accepted: 17 February 2020
Published: 06 March 2020
DOI: https://doi.org/10.1038/s41598-020-60976-6

This article is cited by

Correction of multiple-blinking artifacts in photoactivated localization microscopy
- Louis G. Jensen
- Tjun Yee Hoh
- Dylan M. Owen
Nature Methods (2022)
Unscrambling fluorophore blinking for comprehensive cluster detection via photoactivated localization microscopy
- René Platzer
- Benedikt K. Rossboth
- Mario Brameshuber
Nature Communications (2020)

Comments

By submitting a comment you agree to abide by our Terms and Community Guidelines. If you find something abusive or that does not comply with our terms or guidelines please flag it as inappropriate.