Molecular contribution function in RESOLFT nanoscopy

: The ultimate objective of a microscope of the highest resolution is to map the molecules of interest in the sample. Traditionally, linear imaging systems are characterized by their spatial frequency transfer function, which is given, in real space, by the point spread function (PSF). By extending the concept of the PSF towards the molecular contribution function (MCF), that quantiﬁes the average contribution of a single ﬂuorophore to the image, a straightforward concept for counting ﬂuorophores is obtained. Using reversible saturable optical ﬂuorescence transitions (RESOLFT), ﬂuorophores are eﬀectively activated only in a small, subdiﬀraction-sized volume before they are read out. During readout the signal exhibits an increased variance due to the stochastic nature of prior activation, which scales quadratically with the brightness of the active ﬂuorophores while the mean of the signal scales only linearly with it. Using a two-state Markov model for the activation, showing comparable behavior to the switching kinetics of the switchable ﬂuorescent protein rsEGFP2, we can approximate quantitatively the MCF of RESOLFT nanoscopy allowing to count the number of ﬂuorophores within a subdiﬀraction-sized region of the sample. The method is validated on measurements of tubulin structures in Drosophila melagonaster larvae. Modeling and estimation of the MCF is a promising approach to quantitative microscopy.


Introduction
Labeling molecules of interest with a fluorescent marker molecule delivers excellent contrast and allows optical readout using far-field fluorescence microscopes.For a long time, the spatial resolution of optical microscopes was limited to about half the wavelength of the used light, but exploiting molecular state properties of fluorophores, super-resolution microscopy was able to bring the resolution down to a few nanometers.
Mapping the number density of fluorophores on the nanoscale, where only a few fluorophores are present in each resolved region, has the potential to significantly advance knowledge about cellular processes.Knowing the absolute numbers of molecules and stoichiometries are the basis for structural models of protein complexes and simulations of biological processes.It also allows to define thresholds on the number of molecules to be able to produce a certain effect or identifying the content of molecule clusters or small compartments on a quantitative basis.For example, the estimation of the number of constituent proteins in kinetochores reported unexpected high numbers of proteins present [1], while the quantification of numbers of proteins used for the regeneration of flagella helped refine models for flagella assembly [2].Cluster of Tom20 proteins in the mitochondrial outer membrane seem to have a fixed diameter suggesting an equal number of proteins in each cluster, even though the nanoscale distribution of the clusters is finely adjusted to growth conditions [3].Further knowledge of the number of Tom20 on the nanoscale would support elucidating the regulatory mechanism of the distribution of these proteins.However, quantifying numbers of fluorophores requires a careful calibration of the contribution of each fluorophore to the microscopic image.
All fluorescence nanoscopy or super-resolution microscopy methods rely on consecutively bringing the fluorophores in bright signaling (on) and dark non-signaling (off) states.The state transitions can be elicited either in a spatially controlled or in a spatially stochastic manner.The first strategy is realized in methods called stimulated emission depletion [4,5] (STED), saturated structured-illumination microscopy [6] (SSIM) and reversible saturable optical fluorescence transition [7,8] (RESOLFT).In these methods, an intensity distribution with one or multiple intensity minima drives the molecules optically between an on-and an off-state, transferring the markers to one of these states except those in the vicinity of the minima.The light pattern is then scanned across the sample so that every position is represented in the super-resolved image.Spatially stochastic methods such as photoactivated localization microscopy [9] (PALM) or stochastic optical reconstruction microscopy [10] (STORM) switch adjacent molecules to the on-state one by one.Once activated, they are able to emit multiple fluorescence photons, which are used to identify the position of the molecule.
Spatially stochastic methods intrinsically contain information about single fluorophore contributions.However, counting the molecules is not straightforward.Molecules in the on-state may not emit enough photons to be recognized.Some molecules may not assume this state at all, while other molecules may occupy the on-state repeatedly and be counted multiple times.These effects require extensive modelling of the switching and imaging behavior to be able to count the number of fluorescent molecules in the sample [11][12][13].
In spatially controlled nanoscopy like STED or RESOLFT, the mean obtained signal is proportional to the number of active fluorophores.The signal from a fluorophore, which determines the molecular contribution function (MCF), is measured in detected photons per fluorophore during readout.One approach to calibrate the average signal from each fluorophore would be to use a sample with a known label density, and to compare this standard to the sample of interest.However, with the brightness per fluorophore being sensitive to the kinetics of the state transition parameters, and different optimal imaging parameters like dwell times chosen in each experiment, an intrinsic calibration of the fluorophore in the imaging process is preferable.The average signal can also be calibrated from single bleaching steps, provided that the signal from single fluorophores is detectable [14].In the case of diffusing fluorophores, the average signal of molecules can be estimated intrinsically using the mean and the variance of the fluctuating signal [15].For example, the mean and variance method was utilized to estimate the brightness in combined STED and fluorescence correlation spectroscopy (FCS) experiments for diffusing probes [16].In imaging applications, photon antibunching, which is the fact that a single excited fluorophore can only emit a single photon at a time, and a stochastic model of the image formation process could estimate the number of molecules from a combination of confocal and STED recordings without any external calibration of the brightness per molecule [17].
RESOLFT microscopy based on reversibly switchable fluorescent proteins (RSFP) is able to achieve superresolution using low light levels [18][19][20].Here, the diffraction limit is broken using long-lived dark states, such as the dark isomer of reversibly switchable fluorescent proteins.As these proteins emit fewer photons in their on-state isomer than the synthetic dyes used for STED microscopy, bleaching steps cannot be easily detected and the photon antibunching effect would be rather weak.
Here, a novel method for the intrinsic estimation of the brightness of RSFPs is developed.We propose a model for the image formation process in RESOLFT microscopy.The image is explained by the convolution of the fluorophore distribution with the MCF, which describes the average contribution of a single fluorophore to the image in a linear imaging system, in an un-normalized, quantitative way and directly connects number densities with collected signal levels.Knowledge of the shape and amplitude of the MCF provides a direct method to count the number of molecules in a highly resolved image.
Based on a suitable switching model for the fluorophores, we approximate the shape of the MCF.Free model parameters are calibrated from the image data itself using a data analysis similar to the mean and variance method of [15] but applicable to immobile or only slowly changing molecule arrangements.The calibrated MCF is then applied to estimate the number density in RESOLFT images of microtubule filaments in Drosophila melagonaster larvae where an independent biochemical estimation of the number density is available.

The molecular contribution function (MCF) for counting in fluorescence imaging
A fluorescence image contains the contribution of each fluorescent molecule in the sample.In analogy to the point spread function, which is the image of a single theoretical point source, we define the molecular contribution function (MCF) as the quantitative spatial distribution of the average number of photons counted from a single fluorophore.We assume a space-invariant, linear imaging system where identical fluorophores act independently.Denoting the sampled region by X ⊂ R 2 , the measured 2D image Y(ì x) at each scan position ì x ∈ X is then given by the convolution of the number density n(ì x) with the MCF(ì x) where * denotes the convolution operator (f * g)(ì x) = ∫ f (ì s)g(ì x − ì s) dì s and ε(ì x) describes the measurement noise, i.e. the deviation of a recorded image from the average image.The number density is represented by where δ is the Dirac delta, n 0 the total number of fluorophores, and {ì x j } j=1,...,n 0 are the positions of the fluorophores within the sampled region X.The aim of this work is to count the number of molecules in regions as small as the size of the resolution limit, i.e. the width of the central peak of the MCF.This requires the quantitative knowledge of the MCF.Note that every fluorescence microscopy technique with linear imaging conditions and independent and identically behaving fluorophores features an MCF, which lends itself to mapping the number of fluorophores.The integral over the MCF gives the average total signal S that each fluorophore contributes to the image, The estimated number of fluorophores ni in an isolated region X i is then simply estimated by dividing the signal integrated over X i by the total signal per fluorophore, A sketch of the role of the MCF in estimating the number of fluorophores is shown in Fig. 1.In the following we refine this framework by deriving analytic expressions of the MCF of RESOLFT nanoscopy for a specific switching model, and by estimating S using the mean and the variance of the measured signal.

The MCF of RESOLFT nanoscopy
The RESOLFT principle is based on using switchable fluorophores with on-and off-states.We assume that they are independently switching and emitting fluorescence.For simplicity, we restrict ourselves to positive imaging, i.e. the MCF features a sharp, sub-diffraction sized positive signal peak.
For each scanning position ì s, two conceptual steps can be differentiated.At first, fluorophores are transferred from the off-state, in which they reside initially, into the on-state only within a sharp, subdiffraction sized region centered at ì s.In practice, this is achieved by first activating fluorophores with a diffraction-limited focus, and then deactivating fluorophores in the periphery using a doughnut-shaped focus with a central intensity zero.The spatial distribution of the probability of a single fluorophore to be effectively activated is named p act (ì x).Activated fluorophores are then read out recording their fluorescence signal.The spatial distribution of the recorded fluorescence photon signal of a single activated fluorophore during readout time is s read (ì x), typically proportional to the confocal point spread function (PSF).The MCF of RESOLFT nanoscopy is therefore given by the readout signal multiplied by the activation probability, MCF(ì x) = p act (ì x)s read (ì x) .
In the following, we derive expressions for the two parts of the MCF within the focal plane assuming switching kinetics represented by a two-state Markov model, strong saturation of the off-switching transition while preparing the fluorophore in the on-state, a parabolic intensity dependence of the deactivation light, and a sufficiently short readout duration.

Effective activation by saturated off-switching
The effective activation step preceding the readout depends on the switching kinetics of the fluorophore.A simple model for the switching kinetics is the two-state Markov model with light driven rates depending linearly on the applied light intensity (see Fig. 2a).The evolution of the state populations for this model is laid out in Appendix B.
The effective activation of a molecule in the on-state in RESOLFT nanoscopy is typically the result of two sub-steps [18].First, a diffraction-limited spot of activation light is applied.The wavelength of the activation light is chosen so that the on-switching cross section is larger than the off-switching cross section.Consequently, fluorophores are switched on in the focal region.Then, fluorophores in the periphery of the focus are switched off using a doughnut-shaped focal spot of deactivation light centered at the scan position.For the wavelength of the deactivation light, the off-switching cross-section is larger than the on-switching cross section.To achieve a sub-diffraction sized region, the off-switching by the deactivation light is saturated.
For the two-state switching model, the effective population in the on-state after applying the activation and the deactivation light can be calculated by applying Eq. (25) (see Appendix B) two times one after another with suitable switching light cross sections, spatially dependent irradiation doses and the initial condition of a subsequent switching step equating the final state of the previous switching cycle.
We introduce some approximations that hold for sufficiently strong saturation of the deactivation light, allowing us to derive a particularly simple, radially symmetric expression for p act (ì x) within the focal plane.
For strong saturation, the final state is independent of the initial state as all fluorophores switch to the equilibrium level.The equilibrium level is reached everywhere except close to the focal center.Close to the central zero of the doughnut-shaped spot of deactivation light, the intensity profile can be well approximated by a radially symmetric 2D parabola I(ì x) ≈ I max α|ì x| 2 with a pattern steepness parameter α, a maximal intensity on the rim of the doughnut distribution I max and ì x being a vector in the focal plane with the focal center as origin.For strong saturation, the width of the central peak of activated fluorophores will be far below the diffraction-limit, with a maximal on-state level p on at the center.The effective activation probability in the focal plane, using the Gaussian peak approximation, is then, with the equilibrium level p ∞ .This distribution describes a 2D Gaussian peak function with a full width at half maximum (FWHM) w RESOLFT and a constant offset.The FWHM of this peak follows the well-known inverse square root relation on the applied maximal irradiation dose I max t [8], with the energy of a photon of the deactivation light λ/(hc) and the corresponding total switching cross-section σ (see also Appendix B).From Eq. ( 6), it becomes apparent that the sharp, central spot of activated fluorophores is contrasted against a constant background of fluorophores in the equilibrium state.
The deviation between the used approximations and calculations of focal intensity profiles based on diffraction-theory is shown in Fig. 2. As already mentioned, the deviation is negligible at strong saturation levels.

Signal during readout
We assume that the readout duration is chosen sufficiently small so that during the readout, no additional switching takes place.The signal during readout s read (ì x) is assumed to be collected in a confocal excitation and detection scheme.It can then be described by its shape distribution h read (ì x), normalized so that h read (0) = 1, multiplied by a brightness factor b, s read (ì x) = bh read (ì x) .
The focal readout brightness b is the average signal of an activated fluorophore at the focal center during readout.It summarizes optical and spectral properties of the fluorophores, as well as the detector integration time, and is regarded to be constant and common to all fluorophores within the image region.The spatial constancy of b is an assumption, which needs to be verified in real measurements by means of statistical testing (see the Discussion and Appendix F.3).For confocal or STED imaging conditions during readout, the shape of the signal distribution within the focal plane is often well approximated by another 2D Gaussian peak function [21] with w read the FWHM of the 2D Gaussian peak describing the signal collection.

Estimation of the focal readout brightness b from signal mean and variance
To obtain information about the statistical behavior of the signal, we model the signal as the result of a two-stage stochastic process, where the first stage is the activation and the second stage is the photon emission by activated fluorophores.The activation is modeled as a Bernoulli process with a spatially dependent activation probability distribution.The states of the fluorophores are represented by Bernoulli random variables A j with the effective activation probability p act (ì x j − ì s) for a single fluorophore at position ì x j and scan position ì s.The fluorophore indexed j is active (A j (ì s) = 1) or inactive (A j (ì s) = 0) depending on the value of the random variable A fluorophore that remains in the on-state will on average yield bh read (ì x j − ì s) detected photons, which we model by a Poisson distribution with mean value bh read (ì x j − ì s).In the off-state, it is modeled as not yielding any detectable photons.Therefore, the random variable describing the signal of a fluorophore B j (ì s) depends on the activation state A j (ì s): The mean and the variance of the signal of a single fluorophore are given by Here we used that activation and readout are independent processes.Additionally, we assume that each fluorophore switches its state and emits independently, so that the mean and the variance of the total signal measured at each scan position are simply the sum of the mean and the variance of each individual fluorophore.The total signal detected at position ì s is then given by where D(ì s) ∼ Poisson(d(ì s)) represents background noise with a mean value of d(ì s).We assume that the background contribution D(ì s) is independent of the emission from the fluorophores B j (ì s) and also uncorrelated with the background signal at other positions.Furthermore, the mean m(ì s) = E[Y(ì s)] and the variance v(ì s) = Var[Y(ì s)] of the total collected signal Y(ì s) at each scan position ì s can conveniently be expressed as convolutions with the density of fluorophores, The mean is the convolution of the number density with the MCF including an additional background while the variance exceeds the mean by an additional term with a convolution kernel b 2 h 2 read (ì x)p act (ì x)(1 − p act (ì x)) which is similar but not identical to the MCF.This additional variance, which is visualized in Fig. 3, can be used to estimate the focal readout brightness.
Note that for p act (ì x) equal to 1 we recover the situation of non-switchable fluorophores, the excess variance term vanishes and a purely Poisson distributed signal is regained.
For the case of photoswitchable fluorophores, the variance of the collected signal is augmented due to the stochastic nature of the activation in the preparation step.The excess variance or overdispersion term is proportional to the square of the focal readout brightness b, unlike the mean signal, which scales linearly with it.This relation can be used to estimate b directly from integrated mean and variance of the image data using a Method of Moments estimator (MME), to be derived in the following.
To estimate the mean and variance of the signal at position ì s, we require N ≥ 2 statistically independent images Y j (ì s) (j = 1, . . ., N) of the same structure of interest.The estimated mean m(ì s) and the estimated variance v(ì s) are obtained from the data, ( The pixel-wise estimation will still be influenced largely by the image noise.Also, the average signal and variance depends on the number density in the region of interest.The influence of the in general unknown and non-constant number density is overcome by integrating the mean and variance over the sampled region X ⊂ R 2 .Due to the convolution theorem, only the total number of molecules n X within the region appears in the resulting expression, such that it is possible to estimate the focal readout brightness within this region.The integrated mean M = ∫ X m(ì s) dì s and integrated variance V = ∫ X v(ì s) dì s of the model in Eq. ( 13) gives with the integrated background signal d total = ∫ X d(ì s) dì s , and ) dì s being integrals of products of activation probability distribution p act and readout PSF h read .Here, we assumed that the convolution kernels decay fast enough so that contributions of the fluorophores in the region X are mostly confined to the signal within the region.
The integrated mean M and integrated variance V can be estimated from the data by summing the estimated pixel mean and variances weighted with the area ∆ p of a pixel in the image, Equating the integrated model mean and variances in Eq. ( 15) with the integrated empirical mean and variances in Eq. ( 16), and solving for the focal readout brightness b, results in the MME b for the brightness, An expression for the variance of b in the limit of many fluorophores of known locations is given in Appendix F. For the special case of a single, isolated cluster of fluorophores, an analytic expression for the error in the estimation of b are derived in Appendix C. Essentially, the relative error of b depends mainly on the number of measurements, and is roughly independent of the number of molecules in the image.

Measurements of rsEGFP2 switching kinetics
In the theoretical part, the influence of the switching kinetics model on the shape of the MCF was derived.For a two-state switching model, the width of the central peak of the MCF is determined by the switching rate while the peak is offset by the equilibrium off-switching level.To compare the used switching model with switching of real fluorophores, we observed the switching kinetics of rsEGFP2, a widely used RSFP in RESOLFT nanoscopy [19].The measurements were conducted on a custom built confocal microscope with additional widefield illumination paths for excitation with a 491 nm CW laser and activation with a 375 nm CW laser, as described before (Fig. S2 in [22]).The signal was detected with a point-like detector (APD) collecting fluorescence at 525±25 nm.In this configuration, averaging effects of the observed kinetics due to an inhomogeneous illumination intensity distribution within the detected volume can be excluded.
The sample was prepared by mounting living E. coli cells overexpressing rsEGFP2 on a coverslip in an agarose gel.To measure the ensemble state of the photoswitchable proteins, a stable reference was prepared for each measurement by saturating the activation using a long exposure of 1.6 ms of UV light, up to 160 times longer than the UV pulse used for activation during imaging.The intensity dependence of the switching rate and the switching background was examined using widefield excitation light exposure of constant light dose (exposure time × intensity) using intensities up to 22 kW/cm 2 .
Due to the saturated activation, we assume that all molecules start in the on-state.The signal at any given time is then proportional to the population of rsEGFP2 in the on-state at that time.Results are shown in Fig. 4. As the signal shows systematic deviations from a single exponential decay especially for intermediate times, we devise a decay model assuming a Gamma distribution for the switching rate k with α the shape parameter and β the rate parameter, where the amplitude of the decay is given by A and the signal decays to a background level b.The total switching rate k can be directly measured as the inverse of the time τ = β(exp(1/α) − 1) until the signal drops to 1/e of the initial value.The ratio of the equilibrium signal after complete switching and the initial signal is used to estimate the equilibrium population p ∞ .
The switching rate depends linearly on the intensity for low intensities, up to a few kW/cm 2 .These are typical values for light intensities around the center of the intensity zero, which determine the resolution enhancement of the RESOLFT process.
The equilibrium population does not depend on the intensity used, and is around 2.5% for all intensities.

Estimation of the MCF for RESOLFT images of rsEGFP2-α-tubulin
We recorded RESOLFT images of Drosophila melanogaster larvae body wall muscles expressing ubiquitously rsEGFP2 fused to α-tubulin as described in [23].Tubulin is known to form helices with a diameter of approximately 25 nm, each turn comprising 13 dimers of αand β-tubulin and spaced 8 nm apart [24].The total density of α-tubulin along a single filament can thus be estimated to be ≈ 1625 per µm.
To analyze the ratio of labeled to non-labeled α-tubulin subunits in body wall muscles of Drosophila melanogaster, L3-larvae were dissected to isolate body wall muscles, which were subsequently used as a sample for Western blot analysis as described in Appendix A.2.
The ratio of α-tubulin to rsEGFP2-α-tubulin in the muscle tissue is estimated to be ∼ 1:6 (see Fig. 6).Therefore, we expect the number density of rsEGFP2 molecules along a microtubule fiber to be ∼ 230 per µm.

The shape of the MCF
The MCF of RESOLFT nanoscopy is the product of the effective activation probability p act (ì s) and the readout signal s read (ì s), as described in Section 2.2.For a two-state Markov model with a linear dependence of the rates on the applied light intensity, the MCF in the focal plane can be approximated as the superposition of two Gaussian peaks differing only in amplitude and width.Using the expressions in ( 6) and (7) gives where the width of the sharp RESOLFT peak w eff is given by By taking into account prior knowledge about the imaged object, the shape of the MCF may be estimated from image data itself [25].The structure in the image of microtubule fibers is known to consist of homogeneously labeled filaments with a width well below the resolution of the RESOLFT microscope.Therefore, we model the density of molecules in the image as assembly of lines.For sufficiently sparse filament structures and a high enough SNR, the filaments can be localized without quantitative knowledge of the MCF simply by detecting lines in the image based on the scale space representation of local line profiles [26].The resulting object model is considered proportional to the number density (see Fig. 5(b)).
Given the object model, the shape of the MCF and the constant background level can be estimated by fitting a superposition of two centered Gaussian peaks convolved with the estimated object to the image data using a maximum likelihood approach for Poisson noise (see Fig. 5(c,d)).
From the widths of the fitted Gaussian peaks, the confocal readout FWHM w read was estimated to be 235 nm and for the sharp, RESOLFT peak FWHM w RESOLFT a value of 73 nm was obtained.As the number density was determined only up to a multiplicative constant in the structure estimation, the absolute amplitude of the MCF remains undetermined.However, the ratio of the amplitudes of the two Gaussian peaks can be used to estimate the activation probability at the focal center p on , as the equilibrium population p ∞ = 2.5% is known from the kinetics measurements and assumed to be reached in the periphery of the focal spot (i.e.where the off-switching transition is strongly saturated).
With a ratio of the amplitudes of the sharp peak to the background peak of ∼ 6.9, we estimate that about 17% of the rsEGFP2 molecules at the focal center were effectively activated before readout.Using these values, the ratio of the integrated values of the convolution kernels H 1 /H 2 (see Eq. ( 17)) can be estimated to be ∼ 1.7.

Inference on the focal readout brightness b
To quantitatively estimate the MCF of the RESOLFT image, the multiplicative constant, given by the focal readout brightness b, has to be estimated.As was laid out in Section 2.3, this can be achieved by measuring the integrated mean and the variance of the image signal.For the estimation of the variance, at least two images of the same structure are required.In practice, this can also be achieved by acquiring one image with half the desired pixel size in the fast scanning direction.Splitting the image in two along the fast scanning direction results in two independent images each with square pixels and only a slight relative shift of half a pixel length in the splitting direction (here 12 nm).While this allows to obtain multiple images of the same structure uninfluenced by drift or photobleaching, it can introduce a bias in the empirical variance in Eq. ( 14).As shown in Fig. 8 in the Appendix, the magnitude of this bias depends on the sampling of the image and the orientation of structures in the image.We use an estimator for the variance with significantly reduced bias, derived in Appendix D (see also [27] for a detailed analysis of such difference-based variance estimators): where Y 2 (ì s) is a split image that has neighboring pixels averaged along the split direction.
Applying the method of moments estimator for the focal readout brightness in Eq. ( 17) on the whole image shown in Fig. 5(a) yields a value of b of ∼ 0.9 photon counts per activated rsEGFP2 molecule.Analyzing the time resolved readout signal showed that during readout the effective change in the on-state population was less than 10 %.In Fig. 9, the readout brightness is estimated locally where only the data of a small region is used in each case.The estimated values vary between 0.6 and 1.1, but no systematic dependence on the location of the region is visible.

Counting the number of molecules along filaments
The total contribution S of each fluorophore to the RESOLFT image can be estimated by integrating Eq. ( 19) using the estimated shape and focal readout brightness.The average value of S = 3.58 photons per rsEGFP2 molecule can be used to quantify the number of fluorophores in a region in the image using Eq. ( 4).The results of applying the estimator to the regions marked in Fig. 5 are shown in Table 1.The average number of rsEGFP2 molecules per µm along a microtubule filament is estimated to be ∼ 180, which is similar but below the expected number density.

Simulation of number and brightness analysis for a known MCF shape
To approximate the error of this estimation, we conducted a Monte Carlo simulation implementing the model for the image formation process.Placing a certain number of molecules uniformly on the identified structure and using the obtained effective RESOLFT PSF, images were simulated (Fig. 5(e)).For each scanning position, the activation of the molecules was simulated by drawing binomial random numbers according to Eq. ( 9).From the aggregated mean signal from all active molecules, Poisson distributed random numbers were generated to represent the measured image.
On the whole image, first the focal readout brightness was estimated, then the total number of fluorophores in the image was counted.
The estimated total number and focal readout brightness values of the simulated images are shown in Fig. 5(f).As the total number of detected photon counts is the product of the total number and the signal per fluorophore, there is a strong reciprocal correlation between these values.However, the relative deviation from their mean is well below 10%.Moreover, in Appendix F we derive an analytic expression for the variance of the estimator of b.In this simulation setting, it predicts a value 4.9% for the relative standard deviation of the estimator from its mean, and the value 4.1% for the standard deviation.
To test if variations of the brightness over the field of view can be detected using our method, we simulated 1000 images for different brightness values from the set B = {0.05,0.1, 0.15, . . ., 1.95, 2.0} and compared estimated brightness values from one half of the image simulated with brightness b 1 ∈ B with estimated brightness values from the other half of the image simulated with brightness b 2 ∈ B by applying the 2-sample Welch test [28, p. 447] (see Appendix F.3 for details).As shown in Fig. 10, differences in brightness between both halves of the image of 0.5 can be detected with about 80% probability except when either b 1 or b 2 are very small.Note that this figure depends on the distribution of fluorophores in the image, and is thus not universal.

Discussion & outlook
We derived a rigorous framework for counting molecules in fluorescence microscopy based on the molecular contribution function representing the average contribution of a single fluorophore to the image.The image can be represented as the number density of fluorophores convolved with the MCF.
The MCF was calibrated for the case of RESOLFT nanoscopy by first estimating the shape of the MCF from the image data using a model for the structure of fluorophores in the sample, and then quantifying the focal brightness using the mean and the variance of the measured fluorescence signal.The excess variance term, caused by the stochastic nature of the fluorophore activation in RESOLFT, scales quadratically with the focal brightness per fluorophore.This enables the quantitative estimation of the MCF for RESOLFT nanoscopy from multiple images of the same structure.
The estimation of the shape of the MCF, as well as the focal activation probability requires an accurate model of the switching kinetics of the used reversibly switchable fluorophore.Reversibly switchable fluorescent proteins have been shown to undergo switching by a combination of processes like cis-trans isomerization, protonation, and short scale interactions in the protein [29].Our kinetics measurements of rsEGFP2 suggest that a two-state Markov model for the switching between a fluorescent on-state and a non-fluorescent off-state can be applied to estimate the on-state population of rsEGFP2 with sufficient accuracy.The observed uniform width of the α-tubulin is a strong indication that the resolution and the switching kinetics remain the same across the entire field of view.
This model does not take into account the saturation of the switching rate seen for higher intensities and experimental deviations from the exponential decay predicted in the Markov model.By modeling the switching kinetics more accurately, the shape of the MCF and the activation probability may be estimated more precisely.Any relative error in the estimation of the activation level p on will also directly be present in the estimated number of fluorophores.
To quantify the focal brightness per fluorophore, the mean and the variance of the signal were calculated for a stochastic model with a binomial activation of the fluorophores and a Poisson model for the detected signal from each fluorophore.In this model, the variance depends quadratically on the focal brightness, in contrast to the mean signal, which has a linear dependence.This allows inferring on the focal brightness of the fluorophores intrinsically from the image data using a Method of Moments estimator.
The main difference to earlier work about using fluorescence fluctuations to count molecules ( [15]) is that our method counts molecules in at least temporarily immobile bright fractions of the sample, while these earlier methods quantified mobile fractions of the sample.
We applied our framework to RESOLFT recordings of rsEGFP2 fused to α-tubulin in Drosophila melanogaster larval muscle tissue.The estimated number of rsEGFP2 molecules per µm microtubule filament in Drosophila melanogaster larval muscle tissue is similar but below the value derived by quantitative western blot analysis for dissected muscle tissue.This might hint at a less efficient incorporation of rsEGFP2 fused to α-tubulin.
The focal activation level was estimated to be only ∼ 17%, which limits the emission from the sharp, central peak.The reason for this may be a non-zero intensity at the center of the used doughnut-shaped off-switching light distribution or an incomplete activation in the on-switching step.However, due to increased photobleaching, it may not be desirable to activate a larger fraction of molecules at each scan position.
We assumed that active fluorophores emit fluorescence independently.While interaction of multiple close-by fluorophores has been reported, the average distance of active fluorophores at the focal center during readout for our measurement in Drosophila melanogaster is about 30 nm, a quenching interaction seem unlikely.
The focal brightness is calibrated assuming that it does not change over an image region.This assumption is essential for the method, and we validate it on the image scale by means of statistical testing.We performed a Welch test respecting the potentially inhomogeneous variance (see Appendix F.3 and [28], p. 447) at level α = 0.01 for the hypothesis that two halves of the image have equal brightness, and the test is not able to reject the hypothesis.The detection power of the test is analyzed in simulations of the image structure, which show that the test can detect differences of 0.5 in the brightness with roughly 80% probability.Because the focal brightness is calibrated from the data, a variable brightness between different microscopes, samples or parts of a sample does not compromise the counting ability.
Our simulations showed that the counting error, apart from possible inaccuracies in the MCF model, depends mainly on the number of repeated measurements and the brightness.The achievable number of measurements is limited by the required recording time and the number of possible switching cycles per fluorophore.Future reversibly switchable fluorophores with a lower tendency to bleach may enable a larger number of measurements.
The theoretical considerations as well as the experimental results were limited to 2D images and thin samples corresponding to the commonly used RESOLFT recording mode.The extension to 3D is uncomplicated in the theory and additionally requires only a sufficiently good calibration of the 3D shape of the MCF.
The excess variance of the signal features a different convolution kernel from the mean signal (see Eq. ( 17)), so that Poisson-likelihood based image deconvolution methods cannot be expected to be accurate.However, for our images the standard Richardson-Lucy method performs sufficiently well.In general, special deconvolution techniques adapted to the particular statistics of RESOLFT images might be needed.
Our counting framework maps the number of switchable fluorophores.To quantify the number of biological molecules of interest, the degree, efficiency and specificity of the labeling has to be controlled as well.With an optimal labeling of the target structure with stable labels, this technique should provide an accurate counting of molecules on the nanoscale.

A.1. Sample preparation for switching kinetics measurements
A plasmid, encoding the sequence for rsEGFP2 was cloned into the high-copy expression vector pQE31 (Qiagen) and used in E. Coli cells to obtain highly concentrated proteins in a cellular environment.Direct observation of protein properties in the cytosol was accomplished by immobilizing the bacteria in agarose gel (2%) on standard coverslips (20× 20 mm, #1.5) mounted to an object slide.

A.2. Western Blot Analysis of Drosophila melagonaster larvae
The front half of one wandering third instar larvae expressing rsEGFP2-α-tubulin was dissected in 1x PBS to remove all tissues except body wall muscles.Subsequently the cuticle with attached body wall muscles was transferred to a new Eppendorf tube and mixed with Laemmli Buffer (Sigma-Aldrich, St. Louis, Missouri, USA) and denatured for 10 min at 95 • C before loading onto a SDS-polyacrylamide gel.The protein-lysate of several larvae was mixed and the amount of approximately one larva was loaded into each lane of the gel.The SDS gel was blotted using a semi-dry blotter onto a Nitrocellulose membrane.The protein-blot was blocked using 5% fat free milk powder dissolved in 1 x PBS for 1 hr.The membrane was cut into several pieces and incubated with antibodies against GFP and alpha-tubulin.The polyclonal α-tubulin antiserum (Abcam, Cambridge, England) and the anti-GFP antibody (Clontech, Mountain View, United States) were used in a concentration of 1:2500 in 2,5% fat free milk powder dissolved in 1x PBS for 1 hr.HRP-conjugated secondary antibody (Jackson ImmunoResearch, Suffolk, UK) was used at a dilution of 1:5.000 in 2,5% fat free milk powder dissolved in 1x PBS for 1 hr and detected with an ECL-kit (Perkin Elmer Life Science, Waltham, USA) using an Amersham Imager 600 (GE Healthcare, Little Chalfont, UK).

A.3. Imaging parameters for the RESOLFT images
The RESOLFT image shown in Fig. 5 was taken as described in [23], using an illumination with 2.8µW for 15 µs for the diffraction-limited activation with a 405 nm laser, an illumination with 17.8µW for 500 µs using a doughnut-shaped focus of 488 nm light, and finally a diffraction-limited excitation with 24.2µW using 488 nm light for 100 µs.All powers were measured in the back aperture of the objective lens.For the RESOLFT image, the first 30µs of the readout time were integrated.

B. Two-state Markov switching model with linear rates
We assume a stochastic model for the memory-less switching between a single fluorescent on-state and a single non-fluorescent off-state.A scheme of the model is depicted in Fig. 2(a).The switching rates k on and k off determine the time evolution of the state populations.The time evolution of the on-state population p on is given by an exponential decay with an effective switching rate k = k on + k off , starting from the initial state p 0 and ending in the equilibrium state where the equilibrium state is given by With the population of the off-state being 1 − p on (t), the two-state system is fully characterized.For light driven rates that depend linearly on the intensity of the applied light we can define the switching rates as k on = σ on λ/(hc)I , where σ on and σ off are effective absorption cross-sections for the on-and off-switching, respectively, hc/λ is the energy of a single photon and I is the intensity of the applied light.Defining the effective total absorption cross-section σ = σ on + σ off , the on-state population only depends on the initial state and the irradiation dose It:

C. Brightness and number of fluorophores on single clusters
For the case of a single, isolated cluster consisting of n 0 molecules all at the same position, the estimation of the focal readout brightness b as well as the number of fluorophores is especially straightforward.We assume that N measurements are performed with the cluster location coinciding with the focal center.All fluorophores in the cluster then are subjected to the same effective activation probability p on = p act (0) and the average readout signal is simply b.

C.1. Analytic results
In the following, we assume that the average number of background counts d and the activation probability p on are known with no or negligible error.Following Eq. ( 13), the mean m and the variance v of the measured signal are then given by Equating mean and variance with the sample mean m and the sample variance v from N measured signals {Y j } j=1,...,N and solving Eq. ( 26) for number and focal brightness, gives the estimators n0 and b for the number of molecules n 0 and the focal brightness b using the method of moments n0 = 1 − p on p on We approximate the standard deviations of the estimator for the mean ∆m = Var[ m] and of the estimator for the variance ∆v = Var[v] assuming that the signal follows a normal distribution with the known expectation and variance.This yields By propagating the error for the estimators (27), we get the following result for the relative errors of the estimation of the number and focal brightness: By only keeping the leading term (∆v) 2 /(v − m) 2 under the square root we find a joint, simple expression for a lower bound of the relative error of both the estimated number and the estimated brightness: Here SBR denotes the signal to background ratio, given by Interestingly, the expression (30) for the lower bound of the relative error does not depend on the number of molecules except in the contribution of the signal to background ratio.This result is very similar to corresponding analyses in fluorescence fluctuations theory ( [30]).The number of measurements N, the brightness b and the activation probability p on mainly determine the relative errors.

C.2. Simulation
We simulated RESOLFT data for the simple case of measuring a single, isolated cluster of n 0 fluorophores located directly at the focus.The brightness b, assumed to be equal for all molecules, and the number of repetitions of the measurements were varied.The number and the brightness were estimated according to Eq. ( 27).The relative errors of the estimation are shown in Fig. 7.The lower bounds derived analytically are shown as comparison.The simulations confirm that the true number and brightness can be inferred accurately.However, the estimates for the number and focal brightness are highly correlated and will, in general, be scattered along a hyperbola in number and brightness space.The relative error of brightness and number is rather independent of the number of molecules and also rather independent of the brightness above a certain brightness threshold of 1-2 photons.The error decreases with the square root of the number of repetitions.Above 500 measurements, the relative error is below 10%.

D. Estimator of the variance for shifted images
Due to the image acquisition procedure described in Section 3.2, the recorded image is split in one direction to deliver two independent images Y 1 and Y 2 with square pixels of the same structure but shifted by half a pixel with respect to each other.Depending on the structure, this can lead to a bias in the estimation of the focal brightness, as there are two contributions to the variance: first, the variance due to the switching fluorophores and the photon detection, which is included in our model, and second, the variance due to changes of the structure on the scale of the pixel shift.In the estimator for the focal brightness devised in Section 2.3, this second source of variance is not included.However, the bias due to this effect can be suppressed by estimating the pixel-wise variance in the following way: we examine the squared deviation of the signal in each pixel of one of the images Y 1 to the average signal of the pixels measured directly before and after this pixel in Y 2 : If the variance of Y 2 (ì s − ) equals the variance of Y 2 (ì s + ) and of Y 1 (ì s), the expectation value of this quantity is Such variance estimators were analyzed in [27].To show that this estimator reduces the bias due to the structure in the image, we conducted simulations of line-like structures, both parallel and perpendicular to the scanning direction.The scanning process was simulated realistically by evaluating the signal at positions shifted by half a pixel.The simulation results shown in Fig. 8 demonstrate, that the bias is pronounced mainly for structures that are perpendicular to the scanning direction and is reduced strongly using the estimator (33).

E. Estimation of the brightness in regions within images of rsEGFP2-α-tubulin
The data of Fig. 5(a) was taken and regions were defined (see Fig. 9(a)).Then the brightness was estimated equivalently to Sec. 3.2 but only with the data of each region.The estimated brightnesses are shown in Fig. 9(b).The average estimated brightness of all regions is 0.79 with a standard deviation of 0.16.A systematically inhomogeneous brightness distribution like for example a gradient of the brightness cannot be observed.If the regions would have been made smaller, the fluctuations in the estimation of the brightness would naturally have increased.Additionally, the differences in estimated brightness using only the left and right half of the image, the upper and lower half of the image and the inner and outer half of the image were calculated.The differences in the estimated brightness between these half-image regions was always smaller than 2%.

F. Variance of the estimated focal readout brightness b
The estimator b in (17) with the bias correction from Section D has the form where Y(ì x) are random variables following the model from Eqs. ( 9), ( 10) and ( 12), and d(ì x) is an estimator for the intensity of the Poisson background d(ì x).The constants H 1 and H 2 are defined in the line following Eq.( 15) in the main text (see also Assumption 2 below).For simplicity of notation, we do not write explicitly in Eq. ( 34) and in the following the sample splitting used in Section D to estimate the variance.Instead, we implement the idea of Section D by subtracting the mean of the neighbor pixels values 1 2 Y(ì x − ) + Y(ì x + ) in the numerator of (34).Here, ì x − and ì x + denote the positions of the pixels before and after ì x along the scanning direction (see Section D for details).
In this section we derive an asymptotic expression for the variance of b in the limit of many fluorophores.This limit is realized by a series of experiments where the sampled region X k grows in size with k and where the number of fluorophores n (k)  0 in X k also grows with k.Each experiment yields a set of observations {Y k (ì x)} ì x∈X k , which correspond to photon counts at the scanning positions ì x ∈ X k .In the following, all limits correspond to the asymptotics where the size of the sampling region |X k | and the number of fluorophores n (k)  0 in X k tend to infinity.Note that in the estimator (34) we need an estimator dk (ì x) for the intensity of the Poisson background.The estimator is constructed by selecting a subset X k of the sampled region where no signal is present, and using the mean signal in that region as an estimator for the background, i.e., dk In particular, the estimator dk (ì x) is spatially constant, and hence is expected to perform badly if the true background has strong variations in its intensity.If the background has approximately constant intensity, then dk (ì x) is expected to perform better the larger the set X k is.We will hence assume that the set X k grows in size as k → ∞, but it remains small in comparison with X k .Finally, in order to simplify the computations, we will split the sampled region X k into the sets X k and X k = X k \ X k , and use the first region to estimate the Poisson background, and the second to estimate the variance and the expectation of the signal.Hence, for each k ∈ N, our estimator for the brightness is given by bk where ).We make the following assumptions: 1) The density of fluorophores converges to a constant ρ ∈ (0, ∞), i.e., lim Moreover, the local density of fluorophores is also bounded by a constant ρ ∈ (0, ∞), i.e., lim 2) The PSFs h read and p act do not change with k, and have bounded support: supp h read ⊂ S, supp p act ⊂ S, with |S|<∞.Moreover, the quantities converge to constants H 1 and H 2 as k → ∞.
3) The background noise is independently Poisson distributed D(ì x) ∼ Poisson(d(ì x)) with intensity satisfying Our analysis below uses the fact that D>0, i.e. that there is background noise.We remark nevertheless that under good experimental conditions, the average noise level D might be quite small.Finally, the region X k is chosen such that the estimator dk (ì x) in (35) satisfies as k → ∞, for some constants C 1 , C 2 >0.Given the structure of dk (ì x) in (35), these hold for instance if the signal in the region X k is pure background, the background intensity is constant, and lim k→∞ Then the estimator bk in (36) can be written as In the following we will show that the estimator bk satisfies a central limit theorem of the form as k → ∞ for a certain asymptotic variance σ 2 .This will be proved in the following steps: 1) in Proposition 1, we show that the joint distribution of the quantities 1 2) in Proposition 2, we apply the delta method to show that (37) holds.
In these two steps, we will see that the asymptotic variance σ 2 in (37) can be computed explicitly.Its expression is nevertheless very involved, and hence not very informative.In order to derive a useful approximation for the asymptotic variance of bk , we will approximate the random variables Y k (ì x) by normal random variables with matching expectation and variance, independent from each other.This approximation is motivated by the fact that the distribution function of a Poisson random variable with large intensity can be well approximated by the distribution function of normal random variable with matching expectation and variance.In fact, the approximation error can be shown to behave like the inverse of the square root of the intensity [31].In our case, the intensity will be large if there are many fluorophores in the sample, if their activation probability is not too small, and if the background noise is sufficiently large.We remark that our observations Y k (ì x) are not Poisson, but Poisson conditionally on the molecules being active, which results in an increased variance.We compensate for this by approximating Y k (ì x) by normal random variables with the same increased variance.
We will further assume that the expectation E[Y k (ì x)] and the variance Var Y k (ì x) vary slowly with the position, i.e.
. This assumption is justified by the fact that the pixel size is smaller than the scale at which the PSFs vary.
Under these approximations, we show in Section F.2 that the asymptotic variance σ 2 in (37) can be approximated by ), and hence is expected to perform poorly when b is small.On the other hand, σ2 behaves like b 2 for large brightness b, which reflects the natural fact that, the larger the parameter to be estimated, the larger the absolute error made.Finally, notice that the relative standard deviation of bk , given by σ b , tends to a constant for large brightness.We verify this effect in the simulations shown in Fig. 7.
We remark that Eq. ( 38) is a generalization of the result in Section C. In fact, the equation derived there is essentially (38) keeping only the term 2 , which is in that situation the dominant term.
Finally, we stress that Eq. ( 38) is valuable inasmuch as it characterizes the error made by the estimator bk .Indeed, if we know the parameters of the problem, i.e., PSFs, brightness, noise level and distribution of the fluorophores, then the standard deviation of the estimator bk can be estimated In the following Sections F.1 and F.2, we prove the asymptotic normality of bk and derive the expression (38) as an approximation for its asymptotic variance.

F.1. Asymptotic normality of bk
Here we will prove (37) as sketched in the previous subsection.First, we prove asymptotic normality of the joint distribution of 1  √ . Proposition 1 In the model presented in Section F with assumptions 1), 2) and 3), the limit exists and is a covariance matrix.Moreover, we have as k → ∞.
The main difficulty in proving Proposition 1 lays in the correlation structure of the random variables W k (ì x) and Z k (ì x).In order to take care of the correlations, we use a central limit theorem from [32] for m-dependent triangular arrays.In our case we have m = 3, since the random variable W k (ì x) is correlated with W k (ì x − ) and W k (ì x + ) and is independent of all other W k (ì x ).For each k ∈ N, the sequence W k (ì x) forms a 3-dependent sequence of random variables indexed by ì x.
Proof of Proposition 1: First, we show that Σ is a covariance matrix.Since Σ is symmetric, it is enough to show that t T Σt > 0 (40) for all t ∈ R 2 with t (0, 0) T .In the proof of Eq. (45) below we show that, for any k ∈ N, for any t ∈ R 2 , t (0, 0) T , for a constant C>0 independent of k, and .
Since Σ = lim k→∞ Σ k , Eq. ( 41) implies (40).Next, we prove the convergence in (39).For that, we use the Cramér-Wold argument (see Theorem 29.4 in [33]), which in our setting states that (39) holds if and only if the one dimensional limit 1 holds for any t = (t 1 , t 2 ) with t 2 1 + t 2 2 = 1.Notice that the variance of the right-hand side is In order to prove (42), we apply Theorem 2.1 in [32].It applies to our situation, since we consider the triangular array t 1 W k (ì x) + t 2 Z k (ì x), k ∈ N, ì x ∈ X k of 3-dependent random variables.For simplicity of the statement and the computations, we will re-index the random variables W k (ì x) and Z k (ì x) by an index i = 1, . . .
are ordered so that W k,i , W k,i−1 and W k,i−2 have nonzero correlation.Assume for the moment that, for fixed t 2 1 + t 2 2 = 1, the conditions Var hold for some constants C 1 , C 2 , C 3 >0.Then Theorem 2.1 in [32] states that 1 Var holds as k → ∞.In our original notation, this means that where C > 0 is a constant and we used that W k,i and Z k,i are polynomials in Y k (ì x i ), Y k (ì x i−1 ), Y k (ì x i−2 ) and dk (ì x i ), the maximal power of the product polynomials is 8, and the lower order moments of random variables can be bounded by the higher order moments.Now, notice that by assumption, the intensity d(ì x i ) of the background is bounded for all ì x i .Hence, E[ dk (ì x i ) 4 ] is bounded.On the other hand, the random variable Y k (ì x j ) is the sum of independent Poisson random variables with Bernoulli weights (recall the model in Eqs. ( 9), ( 10) and ( 12)).By the assumption of finite support of the PSFs and the finite density of molecules, we can bound |Y k (ì x j )| by the number of fluorophores that contribute to the sum, which is finite, times the maximum of a finite number of Poisson random variables with bounded intensity.The eighth moment of such a random variable is finite, and we conclude that E[Y k (ì x j ) 8 ] ≤ C<∞.Altogether, we conclude that (43) holds.

For (44), we bound Var
with an argument as above to bound the covariance.Now, the terms Var W k,i and Var Z k,i can be shown to be bounded with an argument as in the proof of (43).Indeed, there we showed that the fourth moment of W k,i is bounded, which implies that Var W k,i is bounded.The term Var Z k,i can be bounded in the same way, and we conclude that (44) holds.
For (45), we have to bound the total variance Var Obviously, the total variance will be bounded from below if the observations Y k (ì x i ) have a variance bounded from below, but to verify the quantitative statement of condition (45) we would require lengthy computations.
Alternatively, we can use the fact that the background noise and the estimator dk (ì x) give a nonzero contribution to the variance.Indeed, writing where the covariance term vanishes since dk (ì x i ) is independent of W k,i and Y k (ì x i ) by the sample splitting used to compute dk (ì x i ).Hence, if we have |t 2 | ≥ for some fixed > 0 to be specified below, assumption 3 at the beginning of Section F gives holds for k large enough and a constant C > 0, which is what we wanted to show.If |t 2 | < , then we can express the first term in (46) as the sum of variances and covariances and get for k large enough, and that where the last equality follows by the definition of W k,i , which gives Assume without loss of generality that t 1 , t 2 ≥ 0.Then, if t 2 < with = 1 √ 6 , we have and hence we have It is not difficult to show that the variance of |X k | i=1 W k,i is bounded from below by the variance of the background noise Var Hence the lower bound follows, and we conclude that (45) holds.
We will now use the delta method (see Sections 3.1 and 3.3 in [34]) to prove the asymptotic normality of bk .
Proposition 2 With the notation and assumptions of Section F, we have as k → ∞, where with Σ i,j being the entries of the covariance matrix in Proposition 1, H 1 and H 2 the constants in Assumption 2, and ρ the density in Assumption 1.

Proof of Proposition 2:
We prove the proposition by applying the delta method to Eq. (39).Defining Multiplying by , and using Assumption 2 and the fact that b = H   As stated above, we assume that the expectation and the variance of the signals vary slowly, so that we can approximate m k (ì x) ≈ m k (ì x ± ) and v k (ì x) ≈ v k (ì x ± ).Define the quantity In this setting, we can compute Moreover, the covariance term between ì x∈X k W k (ì x) and ì x∈X k Y k (ì x) − dk (ì x) is given by using the assumption that the expectation and the variance vary slowly.In the limit k → ∞ this gives Using again the assumption that m k (ì x) and v k (ì x) vary slowly, we can approximate and hence since the first term can be computed explicitly and equals ρb(H 1 + bH 2 ) + D in the limit k → ∞.

F.3. Verification of the homogeneity of the brightness
To test whether local differences in brightness can be detected by our method, we applied the following testing procedure to a set of simulated measurements for the model structure (see Section 3.3).1000 images were simulated for every brightness from the set B = {0.05,0.1, 0.15, . . ., 1.95, 2.0}.Each simulated image was evenly split in 8 parts (2 rows with 4 columns).We then took the 4 parts of the first row from a simulation with brightness b 1 ∈ B and the 4 parts of the second row from a simulation with brightness b 2 ∈ B. For each part, we estimated the brightness as described in Section 2.3.We then tested for homogeneity by applying the 2-sample Welch test ([28, p. 447]) to the results from the first row and the second row.The Welch test tests whether two samples have equal means without assuming that their variances are equal.Equality of variances can not be assumed here since the two samples stem from different parts of the image which are in general different.We used the test level α = 0.01.The relative frequency of rejection of the null hypothesis that the means are equal is shown in Fig. 10 for different values of b 1 and b 2 .It is striking that the power curves are not symmetric; this is a consequence of the fact that the two samples come from different parts of the image.For low values of b 2 the variance of the second sample becomes very high such that the null hypothesis can not be rejected.A related phenomenon occurs for small b 1 .Ignoring this effect, differences of 0.5 in the brightness can be detected with roughly 80% probability at this confidence level.

Fig. 1 .
Fig. 1.The molecular contribution function (MCF) (center, 2× magnified, with a total contribution of 20 photons of a single molecule the image) quantitatively relates the number and positions of molecules n 0 = 50 shown in the molecular map (left) with the total number and distribution of observed photons in the image (right).The number of molecules in a region could be estimated by adding the recorded photons and dividing by the integrated MCF.

Fig. 2 .
Fig. 2. Image formation in RESOLFT nanoscopy.a: Two-state switching model with a fluorescent on-state and a non-fluorescent off-state.The on-and off-switching rates depend on the laser intensity.b: Time evolution of the on-state population for the model in (a) and for a given total switching rate k on + k off .c: Left: Profile of the doughnut-shaped deactivation light (calculation using vectorial diffraction theory, λ = 491 nm, NA=1.4(oil), blue) and corresponding parabolic approximation around the focal center (red, dashed).Right: Population of the on-state after application of the on-and off-switching light (blue) and with spatially constant on-switching and parabolic off-switching approximations (red, dashed).The saturation level (σIt) of the off-switching is quite high (equals 20 on the rim of the doughnut distribution).d: Left: Confocal readout PSF (calculation using vectorial diffraction theory, λ excitation = 491 nm, λ detection = 525 nm, NA=1.4(oil), blue) and Gaussian peak approximation of equal (red, dashed) Right: Resulting shape of the molecular contribution function (MCF/b = p act h read ) with calculated illumination and detection distributions using diffraction theory (blue) and with approximations (red, dashed).

Fig. 3 .
Fig. 3. Concept for counting in RESOLFT nanoscopy.Left: Depiction of a cluster of n 0 = 20 molecules activated with different activation probabilities.Right: Photon counting histogram of the number of photons measured for different activation probabilities, while keeping the average number of photons per fluorophore constant.The variance of the distribution depends on the molecular parameters non-linearly.Mean and variance allow to calibrate the brightness per fluorophore.

Fig. 4 .
Fig. 4. Switching kinetics measurements for rsEGFP2.a: Single switching curve with an off-switching light intensity of 500 W/cm 2 .Fits for exponential decay (red) and gamma distributed exponential decay (yellow) with residuals shown below.b: Measured switching rate k (inverse of the time where the signal drops to 1/e of the initial value) in dependence of the off-switching light intensity.c: Equilibrium off-switching level (average level of the fluorescence in a at long times) in dependence of the off-switching light intensity.

Fig. 5 .
Fig. 5. Measurement of rsEGFP2-α-tubulin and estimation of the shape of the MCF.a: RESOLFT 2D image of dissected body wall muscles of wandering third instar Drosophila melanogaster larvae expressing rsEGFP2-α-tubulin.b: Line detection of the structures in the image in a. c: Result of the PSF estimation (superposition of two Gaussian peaks).d: Calibration of the activation probability (setting the equilibrium level to the one observed for rsEGFP in kinetics measurements).e: Simulated image resembling the image of a by taking the estimated structure from b and uniformly drawing a certain number of molecules from it.With the estimated MCF in c, noisy images can be simulated.f: Histograms of estimated readout brightness and estimated total fluorophore number for 1000 simulations.The true readout brightness and number of molecules in the image were chosen to equal the estimated readout brightness and number determined from the measurement.Scale bars 1µm (a,b,e), 200nm (c).

Fig. 6 .
Fig. 6.Western blot analysis of body wall muscle protein lysate of rsEGFP2-α-tubulin expressing Drosophila melanogaster third instar larvae.a: Comparison of the signal from rsEGFP2-α-tubulin (A) to the signal of unlabeled α-tubulin (B) via labeling using antiserum against α-tubulin.Ratio was determined on the total signal intensity of the respective peak.b: Control staining using antiserum against GFP.

Fig. 7 .
Fig. 7. Simulation of isolated clusters of molecules.Relative standard deviation of the estimated number (a) and estimated brightness (b)for n 0 = 20 molecules in one cluster, with an activation probability of 20%.Above a minimum threshold brightness, the counting error depends mainly on the number of measurements (N).Full simulation results are plotted as dots, the calculated expressions of Eq. (30) are shown as lines.

Fig. 8 .
Fig. 8. Influence of the pixel shift between the two images Y 1 and Y 2 after splitting the recorded image.Simulations of images with line-like structures parallel ( ) and perpendicular (⊥) to the scanning direction.The biased estimator simply ignores the pixel shift, while the unbiased estimator implements Eq. (33).Scale bars, 200nm.

Fig. 9 .
Fig. 9. Estimation of the brightness in regions of the image of Fig. 5(a).The regions were chosen manually to adapt to the structure and include similar amounts of structure.Their boundaries are shown as green lines in a.In b the estimated brightness for each region using only data from this region is shown as color in the respective regions.
for large k, where m k (ì x) = E Y k (ì x) and v k (ì x) = Var Y k (ì x) .Using this approximation and the limit theorem (37) for bk , we estimate the standard deviation of the estimator bk by σ √ |X k | .Notice that σ2 behaves like b −2 for small values of b.This is consistent with the fact that our method of moments estimator is based on the relation Var[Y k (ì x)] − E[Y k (ì x)] = O(b 2 by σ √ |X k | .We compute this quantity in the setting of the simulations of Section 3.3, which gives the standard deviation and relative standard deviation σ |X k | = 0.041, σ b |X k | = 0.049, for the value b = 0.83 used in the simulations.The relative standard deviation computed from the simulations in Fig. 5(f) agrees with these predictions.
) holds as k → ∞, which is what we wanted to prove.It remains to verify the conditions (43) to (45).For condition (43) with δ = 2, we use that |t 1 |, |t 2 | ≤ 1 and bound |X k | i=1 D i , which by Assumption 3 behaves like D|X k | asymptotically.

F. 2 .
Approximation of the variance of bProposition 2 gives us an explicit expression for the asymptotic variance of bk .This expression is nevertheless very involved and not very informative.In this section we will approximate this expression for the variance by approximating the Poisson random variables Y k (ì x) by normal random variables with matching expectation and variance,Y k (ì x) ∼ N E Y k (ì x) , Var Y k (ì x) , ì x ∈ X k ,independent from each other.In the following we will use the notation introduced in (13), i.e.,m k (ì x) = E Y k (ì x) , v k (ì x) = Var Y k (ì x) .

Fig. 10 .
Fig. 10.Empirical power of the Welch test applied to two samples of the simulated images in Fig. 5(e) for different brightnesses.The first sample was taken from a set of four possible area from the upper half and the second sample from four possible areas from the lower half of the simulated images.The brightness of the first sample is b 1 and that of the second sample b 2 .The nominal test level was α = 0.01.