Nanoscopy based on single emitter switching Claas

We propose and analyze a method for isotropic resolution in far-field fluorescence nanoscopy based on switching and mathematically localizing individual emitters. Under typical imaging conditions, the coherent detection of fluorescence light through two opposing high angle lenses strongly improves the 3D-resolution down to 5–10 nm in all directions. Furthermore, we give a detailed analysis of the resolution of this and other single molecule switching based approaches using the Fisher information matrix. We verify the results by Monte-Carlo simulations of the imaging process and by applying a simple maximum-likelihood estimator for position determination.

© 2008 Optical Society of America

OCIS codes: (110.0180) Imaging systems: Microscopy; (110.2990) Imaging systems: Image formation theory; (180.2520) Microscopy: Fluorescence microscopy

#102779 $15.00 USD Received 15 Oct 2008; revised 21 Nov 2008; accepted 24 Nov 2008; published 1 Dec 2008 (C) 2008 OSA 8 December 2008 / Vol. 16, No. 25 / OPTICS EXPRESS 20774

References and links
1. E. Abbe, “Beiträge zur Theorie des Mikroskops und der mikroskopischen Wahrnehmung,” Arch. f. Mikr. Anat. 9, 413–420 (1873).
2. S. W. Hell and J. Wichmann, “Breaking the diffraction resolution limit by stimulated emission: stimulated emission depletion microscopy,” Opt. Lett. 19, 780–782 (1994).
3. S. W. Hell, “Toward fluorescence nanoscopy,” Nat. Biotechnol. 21, 1347–1355 (2003).
4. S. W. Hell, “Far-field optical nanoscopy,” Science 316, 1153–1158 (2007).
5. G. Donnert, J. Keller, R. Medda, M. A. Andrei, S. O. Rizzoli, R. Lührmann, R. Jahn, C. Eggeling, and S. W. Hell, “Macromolecular-scale resolution in biological fluorescence microscopy,” Proc. Natl. Acad. Sci. USA 103, 440–445 (2006).
6. B. Harke, C. Ullal, J. Keller, and S. W. Hell, “Three-dimensional nanoscopy of colloidal crystals,” Nano. Lett. 8, 1309–1313 (2008).
7. S. Hell and E. H. K. Stelzer, “Properties of a 4Pi-confocal fluorescence microscope,” J. Opt. Soc. Am. A 9, 2159–2166 (1992).
8. S. W. Hell, E. H. K. Stelzer, S. Lindek, and C. Cremer, “Confocal microscopy with enhanced detection aperture: type B 4Pi-confocal microscopy,” Opt. Lett. 19, 222–224 (1994).
9. M. Nagorni and S. W. Hell, “Coherent use of opposing lenses for axial resolution increase in fluorescence microscopy. I. Comparative study of concepts,” J. Opt. Soc. Am. A 18, 36–48 (2001).
10. M. Dyba and S. W. Hell, “Focal spots of size λ/23 open up far-field fluorescence microscopy at 33 nm axial resolution,” Phys. Rev. Lett. 88, 163901 (2002).
11. R. Schmidt, C. A. Wurm, S. Jakobs, J. Engelhardt, A. Egner, and S. W. Hell, “Spherical nanosized focal spot unravels the interior of cells,” Nature Methods 5, 539–544 (2008).
12. M. Hofmann, C. Eggeling, S. Jakobs, and S. W. Hell, “Breaking the diffraction barrier in fluorescence microscopy at low light intensities by using reversibly photoswitchable proteins,” Proc. Natl. Acad. Sci. USA 102, 565–569 (2005).
13. W. Heisenberg, The Physical Principles of Quantum Theory (Chicago Univ. Press, Chicago, 1930).
14. E. Betzig, G. H. Patterson, R. Sougrat, O. W. Lindwasser, S. Olenych, J. S. Bonifacino, M. W. Davidson, J. Lippincott-Schwartz, and H. F. Hess, “Imaging intracellular fluorescent proteins at nanometer resolution,” Science 313, 1642–1645 (2006).
15. M. J. Rust, M. Bates, and X. Zhuang, “Sub-diffraction-limit imaging by stochastic optical reconstruction microscopy (STORM),” Nature Methods 3, 793–796 (2006).
16. S. T. Hess, T. P. Girirajan, and M. D. Mason, “Ultra-high resolution imaging by fluorescent photoactivation localization microscopy (FPALM),” Biophys. J. 91, 4258–4272 (2006).
17. A. Egner, C. Geisler, C. v. Middendorff, H. Bock, D. Wenzel, R. Medda, M. Andresen, A. Stiel, S. Jakobs, C. Eggeling, A. Schönle, and S. W. Hell, “Fluorescence nanoscopy in whole cells by asynchronous localization of photoswitching emitters,” Biophys. J. 93, 3285–3290 (2007).
18. H. Shroff, C. G. Galbraith, J. A. Galbraith, H. White, J. Gillette, S. Olenych, M. W. Davidson, and E. Betzig, “Dual-color superresolution imaging of genetically expressed probes within individual adhesion complexes,” Proc. Natl. Acad. Sci. USA 104, 20308–20313 (2007).
19. H. Bock, C. Geisler, C. A. Wurm, C. v. Middendorff, S. Jakobs, A. Schönle, A. Egner, S. W. Hell, and C. Eggeling, “Two-color far-field fluorescence nanoscopy based on photoswitchable emitters,” Appl. Phys. B 88, 161–165 (2007).
20. M. Bates, B. Huang, G. T. Dempsey, and X. Zhuang, “Multicolor super-resolution imaging with photo-switchable fluorescent probes,” Science 317, 1749–1752 (2007).
21. M. Bossi, J. Fölling, V. N. Belov, V. P. Boyarskiy, R. Medda, A. Egner, C. Eggeling, A. Schönle, and S. W. Hell, “Multicolor far-field fluorescence nanoscopy through isolated detection of distinct molecular species,” Nano. Lett. 8, 2463–2468 (2008).
22. J. Fölling, V. Belov, R. Kunetsky, R. Medda, A. Schönle, A. Egner, C. Eggeling, M. Bossi, and S. W. Hell, “Photochromic rhodamines provide nanoscopy with optical sectioning,” Angew. Chem. Int. Ed. 46, 6266–6270 (2007).
23. B. Huang, W. Wang, M. Bates, and X. Zhuang, “Three-dimensional super-resolution imaging by stochastic optical reconstruction microscopy,” Science 319, 810–813 (2008).
24. M. F. Juette, T. J. Gould, M. D. Lessard, M. J. Mlodzianoski, B. S. Nagpure, B. Bennett, S. T. Hess, and J. Bewersdorf, “Three-dimensional sub-100 nm resolution fluorescence microscopy of thick samples,” Nature Methods 5, 527–529 (2008).
25. J. Nowakowski and M. Elbaum, “Fundamental limits in estimating light pattern position,” J. Opt. Soc. Am. 73, 1744–1758 (1983).
26. K. Winick, “Cramer-Rao lower bounds on the performance of charge-coupled-device optical position estimators,” J. Opt. Soc. Am. A 3, 1809–1815 (1986).
27. R. E. Thompson, D. R. Larson, and W. W. Webb, “Precise nanometer localization analysis for individual fluorescent probes,” Biophys. J. 82, 2775–2783 (2002).
28. M. K. Cheezum, W. F. Walker, and W. H. Guilford, “Quantitative comparison of algorithms for tracking single fluorescent particles,” Biophys. J. 81, 2378–2388 (2001).
29. R. J. Ober, S. Ram, and E. S. Ward, “Localization accuracy in single-molecule microscopy,” Biophys. J. 86, 1185–1200 (2004).
30. A. d. Dekker, S. v. Art, A. v. Bos, and D. v. Dyck, “Maximum likelihood estimation of structure parameters from high resolution electron microscopy images: A theoretical framework,” Ultramicroscopy 104, 83–106 (2005).
31. R. A. Fisher, “Theory of statistical estimation,” Proc. Cambridge Philos. Soc. 22, 700–725 (1925).
32. C. Rao, “Information and the accuracy attainable in the estimation of statistical parameters,” Bull. Calcutta Math. Soc. 37, 81–89 (1945).
33. B. Richards and E. Wolf, “Electromagnetic diffraction in optical systems II. Structure of the image field in an aplanatic system,” Proc. R. Soc. Lond. A 253, 358–379 (1959).
34. D. R. Cox and D. V. Hinkley, Theoretical Statistics (Chapman and Hall, London, 1974).
35. H. Cramer, Mathematical Methods of Statistics (Princeton Univ. Press, Princeton, 1946).
36. A. G. Basden, C. A. Haniff, and C. D. Mackay, “Photon counting strategies with low-light-level CCDs,” Monthly Notices RAS 345(3), 985–991 (2003).
37. H. P. Kao and A. Verkman, “Tracking of single fluorescent particles in three dimensions: Use of cylindrical optics to encode particle position,” Biophys. J. 67, 1291–1300 (1994).
38. M. Born and E. Wolf, Principles of Optics (Cambridge Univ. Press, Cambridge, 2002).
39. F. Aguet, D. V. D. Ville, and M. Unser, “A maximum-likelihood formalism for sub-resolution axial localization of fluorescent nanoparticles,” Opt. Express 13, 10503–10522 (2005).
40. M. G. Gustafsson, D. A. Agard, and J. W. Sedat, “I5M: 3D widefield light microscopy with better than 100 nm axial resolution,” J. Microsc. 195, 10–16 (1999).
41. A. Egner, S. Jakobs, and S. W. Hell, “Fast 100-nm resolution 3D-microscope reveals structural plasticity of mitochondria in live yeast,” Proc. Natl. Acad. Sci. USA 99, 3370–3375 (2002).
42. H. Gugel, J. Bewersdorf, S. Jakobs, J. Engelhardt, R. Storz, and S. W. Hell, “Cooperative 4Pi excitation and detection yields 7-fold sharper optical sections in live cell microscopy,” Biophys. J. 87, 4146–4152 (2004).
43. D. Khimich, R. Nouvian, R. Pujol, S. T. Dieck, A. Egner, E. D. Gundelfinger, and T. Moser, “Hair cell synaptic ribbons are essential for synchronous auditory signalling,” Nature 434(7035), 889–894 (2005).
44. A. Egner, S. Verrier, A. Goroshkov, H.-D. Söling, and S. W. Hell, “4Pi-microscopy of the Golgi apparatus in live mammalian cells,” J. Struct. Biol. 147(1), 70–76 (2004).
45. V. Westphal and S. W. Hell, “Nanoscale resolution in the focal plane of an optical microscope,” Phys. Rev. Lett. 94, 143903 (2005).


Introduction
It had long been believed that diffraction limits the resolution of basically any far-field optical microscope to ∼250 nm. In the last decade, however, far-field fluorescence microscopy with a resolution of < 20 nm has become a reality. The insight that Abbe's diffraction barrier [1] can be circumvented by utilizing the dye molecule's spectroscopic properties [2,3,4] ignited the implementation of several methods that are ultimately based on optical switching of light-emitting markers between bright and dark states. Switching allows sequential readout of spatial information from regions too small to be directly resolved by the imaging optics. The first such imaging scheme was stimulated emission depletion (STED) microscopy [2,3]. STED switches off all dye molecules outside a defined, sub-diffraction sized region by irradiating them with red-shifted light and thus effectively keeping them in their ground state. With a cylindrical doughnut for depletion, this technique is capable of multi-color imaging and delivers resolutions below 20 nm [5]. In order to suppress out-of-focus background and achieve optical sectioning, confocal detection can be employed. 3D superresolution is achieved if the intensity pattern of the stimulating light surrounds the focal spot [6], but compared to the lateral case, much more light is needed in order to improve the axial resolution substantially below 100 nm. Just as in any single-lens-based microscopy, the underlying reason is the incomplete use of the full solid angle both for irradiation and detection. The asymmetry is lifted by the 4Pi-arrangement [7] which uses a second objective lens and coherently adds the two counter-propagating wavefronts of the exciting light (4Pi type A), the emitted fluorescence (4Pi type B) [8] or both (4Pi type C), thus improving axial resolution by a factor of 3–7 [9]. Applying the same concept to the stimulating beam in STED-microscopy results in dramatically improved z-resolution [10] and allows imaging with an isotropic resolution of ∼40 nm [11].
Based on the same principle as STED, other methods were proposed and implemented that used emitters that can be optically switched between meta-stable on- and off-states and thus allow the collection of many photons from each dye molecule during a single switching cycle [12]. While comparable resolution improvements to STED were not yet achieved using this approach, it entailed the discovery of a second very powerful path to sub-diffraction resolution by optical switching: Instead of switching off all emitters outside pre-defined regions, just a few, sparsely distributed molecules are allowed into their on-state. Because each dye usually emits N > 1 photons while in its on-state, the position of such an isolated emitter can be estimated ∼√N times better than the diffraction limit [13]. The final image is then reconstructed as a histogram of accumulated molecule positions.
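The √N scaling quoted above can be illustrated with a minimal Monte-Carlo sketch. It assumes a Gaussian stand-in for the diffraction-limited spot, no background and no pixelation; the function name `centroid_spread` and all numerical values are illustrative choices, not taken from the manuscript.

```python
import numpy as np

def centroid_spread(psf_sigma_nm, n_photons, trials=2000, seed=0):
    """Empirical std (nm) of the centroid of n_photons photon positions.

    Assumption: photon positions are drawn from a 1D Gaussian of width
    psf_sigma_nm, i.e. a background-free, unpixelated toy PSF.
    """
    rng = np.random.default_rng(seed)
    photons = rng.normal(0.0, psf_sigma_nm, size=(trials, n_photons))
    return photons.mean(axis=1).std()

# A spot with ~250 nm FWHM corresponds to a standard deviation of FWHM / 2.355.
psf_sigma = 250.0 / 2.355
for n in (100, 1000):
    # Second column: simulated spread, third column: psf_sigma / sqrt(N).
    print(n, centroid_spread(psf_sigma, n), psf_sigma / np.sqrt(n))
```

In this idealized setting the centroid spread follows psf_sigma/√N closely; background and pixelation, which the Theory section treats via the Fisher information, make the real bound worse.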
Several variants using this principle were reported in quick succession and called photo-activation localization microscopy (PALM) [14], stochastic optical reconstruction microscopy (STORM) [15], fluorescence photo-activation localization microscopy (FPALM) [16] and PALM with independently running acquisition (PALMIRA) [17]. The latter used a continuous illumination scheme and, more importantly, demonstrated that matching exposure times with the average on-time of the emitters allows for much shorter acquisition times and application to 3D samples due to an improved signal-to-noise ratio. With present dyes allowing the detection of up to a few thousand photons per emitter during a switching cycle, lateral resolutions of approximately 20–40 nm are reached. Using several of the various switchable dyes available, multi-color imaging is readily implemented [18,19,20,21]. Three-dimensional imaging using this method was demonstrated by consecutive two-photon-activation of layers in a 3D-sample [22]. In this approach the axial resolution is comparable to that of a two-photon-excitation microscope and does not match the ∼40 nm resolution in the focal plane. On the other hand, in particle tracking, single, isolated emitters are routinely localized in three dimensions by exploiting the fact that the defocused, possibly aberrated image of a point-like emitter strongly depends on its axial position. The combination of these techniques with optical switching delivered three-dimensional resolution beyond the diffraction limit [23,24]. However, the precision of axial localization and thus resolution along the optic axis remains much poorer than in the focal plane for the very same reason as in single-lens imaging: fluorescence light is collected from only one side, thus using only a fraction of the full solid angle for focusing and introducing asymmetry. It is therefore self-evident that the key to isotropic resolution lies in the use of two opposing lenses of high aperture angle.
In this manuscript we propose such a setup and show how its combination with multiple interference in the detection path [7,8] allows for unique position assignment and almost translation-invariant resolution. We also present a rigorous quantitative comparison with previous approaches to three-dimensional nanoscopical imaging based on switching single emitters by comparing available strategies of microscopic three-dimensional localization of point-like sources of incoherent light.

Theory
The task of measuring the position or tracking the path of isolated emitters such as single molecules or fluorescent beads has been extensively investigated since the advent of high-quality CCD cameras in the 1980s [25,26] and applied e.g. to microscopic particle tracking [27,28,29]. Statistical models based on Poisson pixel statistics and Gaussian object functions have been considered thoroughly in this context for electron microscopy [30]. How well a microscope can locate single emitters depends on the signal and noise levels and on the form of the microscope's detection point-spread-function (PSF) h(x, y, z), which is the normalized light intensity an emitter at position z on the optic axis in the sample causes at position (x, y) in the (back-projected) image plane. The estimation of lateral positions is usually straightforward. In most cases, the microscope's detection PSF h(r) is nearly invariant under lateral translation of the emitter. Therefore, locating the emitter is equivalent to finding a given intensity pattern in the image and estimating its position. Most algorithms do this by calculating the center of mass of the detection pattern or by fitting a suitable, parameterized model function to it. Background is considered by including it in the fit model or by subtracting an estimated value prior to the analysis. When tracking single particles in all three dimensions, an estimation procedure for the axial position is necessary which is somewhat more complicated, because the form rather than the position of the detected pattern changes with the emitter's z-position. As an example, when using ordinary wide-field detection, the axial distance from the focal plane can roughly be estimated from the degree of blurring of the detected spot but with obvious drawbacks: Estimates close to the focal plane z = 0 are very inaccurate as the image changes very little when varying the emitter's z-position (see Figure 1). Even worse, due to the symmetry of the PSF about the focal plane, the images of particles at positions +z and −z are indistinguishable even for larger z, resulting in false assignments. Three-dimensional particle tracking thus relies on modified detection schemes which break the symmetry of the PSF. Their potential performance can be quantified independently of the actual implementation by determining the information content of a typical measurement [31]: For a given average number of detected photons and a given background level the Cramér-Rao uncertainty [32] gives a lower bound for the variance of any unbiased position estimation procedure.
Fig. 1. xz-cuts of the PSFs h (top row) for an unaberrated PSF (a), in the presence of an astigmatism (b) and detection in two axially offset channels in order to break the axial symmetry (c). The second row shows xz-cuts of the differential axial information content $i_{zz,k}$ while the third row shows the combined differential information content averaged over all polar angles, $i_{zz}$. Please note that for the astigmatism case h(x, y, z) and thus $i_{zz,k}(x, y, z)$ are not rotationally symmetric in the xy-plane. For example, the yz-cut would be the mirror image about the focal plane of the xz-cut shown here.

Let our detection scheme consist of several optical detection channels with detection PSFs $h_k(\mathbf{r})$. Each channel is imaged onto a CCD camera. The imaging system is then described by the normalized intensity $h_{ijk}(\mathbf{r})$ that a point-like emitter at position $\mathbf{r}$ in the sample radiates onto pixel $(i, j)$ at position $(x_i, y_j)$ in the back-projected image plane of the camera of channel $k$. Throughout the manuscript we calculate h using vectorial diffraction theory [33] assuming randomly polarized emission at a wavelength of 575 nm and detection through an oil-immersion lens with a half-aperture angle of α = 64.5°. We choose the normalization condition

$$\sum_{i,j,k} h_{ijk}(\mathbf{r}) = 1.$$

If N denotes the expectation value for the total number of photons detected from the emitter in all channels together, the expectation value for each pixel is given by

$$n_{ijk}(\mathbf{r}) = N\,\bigl[h_{ijk}(\mathbf{r}) + b\bigr],$$

where b is the average number of background photons per pixel normalized with the total signal. For the sake of simplicity we assume the number of photons per pixel, $N_{ijk}$, to be also Poisson-distributed with expectation values $n_{ijk}$. The probability to obtain an image $\{N_{ijk}\}$ from an emitter at position $\mathbf{r}$ in the sample thus reads

$$p\bigl(\{N_{ijk}\} \mid \mathbf{r}\bigr) = \prod_{i,j,k} P\bigl(N_{ijk};\, n_{ijk}(\mathbf{r})\bigr), \tag{4}$$

with

$$P(N;\, n) = \frac{n^{N}}{N!}\, e^{-n}$$

being the probability mass function of the Poisson distribution. Given a statistical model $p(o \mid \mathbf{r})$ for measured data o conditional on the parameter vector $\mathbf{r}$, its information content can be measured using the Fisher information matrix I [31,34], which is given by

$$I_{qs}(\mathbf{r}) = \int p(o \mid \mathbf{r})\, \frac{\partial \ln p(o \mid \mathbf{r})}{\partial r_q}\, \frac{\partial \ln p(o \mid \mathbf{r})}{\partial r_s}\, \mathrm{d}o, \tag{6}$$

where q and s index the elements of the parameter vector. Substituting Eq. (4) into (6), the integral over all possible outcomes o becomes the sum over all possible images $\{N_{ijk}\}$ and after a few algebraic conversions one obtains

$$I_{qs}(\mathbf{r}) = N \sum_{i,j,k} \frac{1}{h_{ijk}(\mathbf{r}) + b}\, \frac{\partial h_{ijk}(\mathbf{r})}{\partial r_q}\, \frac{\partial h_{ijk}(\mathbf{r})}{\partial r_s}. \tag{7}$$

If the support, i.e. the size of the camera chip compared to the extent of the PSF, is sufficiently large and the camera pixels sufficiently small, we can disregard the dependence on the lateral position and neglect effects due to discretization. Defining the differential information content for each diagonal element and channel yields

$$i_{qq,k}(\mathbf{r}) = \frac{1}{h_k(\mathbf{r}) + b} \left( \frac{\partial h_k(\mathbf{r})}{\partial r_q} \right)^{2},$$

with $h_k$ and b both taken per pixel area. In the case of aberrated detection introduced below the PSF is not radially symmetric. It therefore proves reasonable to also introduce the angle-averaged differential information content summed over all channels

$$i_{qq}(r, z) = \frac{1}{2\pi} \sum_{k} \int_{0}^{2\pi} i_{qq,k}(r\cos\varphi,\; r\sin\varphi,\; z)\, \mathrm{d}\varphi,$$

where (r, φ) are polar coordinates in the xy-plane. All the PSFs considered are symmetric about both the xz and the yz planes. It immediately follows that the summation over i and j vanishes in Eq. (7) for all s ≠ q, making the Fisher information matrix diagonal with its diagonal elements given by

$$I_{qq} = N \sum_{k} \iint i_{qq,k}(x, y, z)\, \mathrm{d}x\, \mathrm{d}y = N \int_{0}^{\infty} 2\pi r\; i_{qq}(r, z)\, \mathrm{d}r \equiv N\, I_q. \tag{10}$$

Here, we also introduced the count-rate normalized information content $I_q$ which is independent of the total number of photons detected and only depends on the form of the PSF and the relative background, b. Given the Fisher information, the Cramér-Rao uncertainty [35,32] yields a lower bound for the covariance matrix of any unbiased estimator of $\mathbf{r}$. In our case of a diagonal Fisher information matrix this simply means that the variance of any unbiased estimator is bounded by

$$\sigma_q^{2} \ge \frac{1}{I_{qq}} = \frac{1}{N\, I_q}. \tag{11}$$

As expected, $\sigma_q$ scales with $1/\sqrt{N}$ for a given PSF and relative background. We note again the approximation that the photon numbers per pixel are independently Poisson distributed. The distribution of the total number N of detected photons depends on the experimental details. Depending on the exposure time, it may be well approximated by a Poisson or a geometric distribution. The conditional distribution of the $\{N_{ijk}\}$ given N is then multinomial with probabilities $h_{ijk}$. A quick calculation shows that in the absence of background noise and for fixed N a rigorous treatment results in the same Fisher information matrix as given above. We therefore expect only minimal corrections to the resolution values calculated below. In addition, most cameras introduce some kind of Gaussian read-out noise during digitization. Its quantitative effect is best accommodated by adjusting b to add Poissonian background with the same variance. At fast readout rates electron multiplying CCD cameras are often used. While they have small effective read-out noise, they change the probability distribution of the signal through a gain process prior to digitization. For large gain factors the SNR is reduced by a factor of $\sqrt{2}$ [36] and a more detailed analysis reveals that the effect is largely equivalent to cutting the number of detected photons in half.
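The bound derived in this section can be evaluated numerically for any pixelated PSF model. The following is a minimal sketch, not the vectorial calculation used in the manuscript: it assumes a 2D Gaussian stand-in PSF, a single channel, and finite-difference derivatives. The names `gaussian_psf` and `cramer_rao`, the 20 nm pixel size and the 100 nm spot width are all illustrative assumptions.

```python
import numpy as np

PIXEL = 20.0                                        # assumed back-projected pixel size (nm)
CENTERS = np.arange(-600.0, 600.0 + PIXEL, PIXEL)   # pixel centres along one axis (nm)

def gaussian_psf(x0, y0, sigma=100.0):
    """Stand-in for h_ij(r): pixelated 2D Gaussian summing to ~1 over the chip."""
    gx = np.exp(-(CENTERS - x0) ** 2 / (2 * sigma**2))
    gy = np.exp(-(CENTERS - y0) ** 2 / (2 * sigma**2))
    return np.outer(gx, gy) * PIXEL**2 / (2 * np.pi * sigma**2)

def cramer_rao(psf, r, n_photons, b, eps=0.5):
    """Lower bound sigma_q = 1/sqrt(N * sum (d_q h)^2 / (h + b)) per coordinate q."""
    r = np.asarray(r, dtype=float)
    h = psf(*r)
    bounds = []
    for q in range(r.size):
        step = np.zeros_like(r)
        step[q] = eps
        dh = (psf(*(r + step)) - psf(*(r - step))) / (2 * eps)   # central difference
        bounds.append(1.0 / np.sqrt(n_photons * np.sum(dh**2 / (h + b))))
    return np.array(bounds)

# Without background the bound reduces to sigma/sqrt(N); background makes it worse.
print(cramer_rao(gaussian_psf, (0.0, 0.0), 250, 0.0))
print(cramer_rao(gaussian_psf, (0.0, 0.0), 250, 1 / 250))
```

Because `cramer_rao` only needs a callable returning the pixel array, other PSF models can be substituted for the Gaussian without changing the bound calculation.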

Detection schemes
Equation (10) is essentially a quantitative expression for the intuitive fact that the precision with which we can determine a particle's position along a given axis depends on how strongly the PSF varies in that direction. Figure 1a depicts the PSF h and the resulting differential information content $i_{zz}$ for ordinary wide-field detection. As for all calculations throughout the paper we have assumed an average of 250 detected photons per objective lens and a homogeneous background of one photon per pixel (N = 250 and b = 1/250 for one lens, N = 500 and b = 1/500 for interferometric detection through two lenses). As outlined above, $i_{zz}$ vanishes at z = 0 and remains very small in the immediate vicinity due to the symmetry of h about the focal plane. Consequently the Cramér-Rao bound shown as the red, dashed line in Fig. 2b diverges at this position. An additional problem is the ambiguous position assignment which is also caused by the symmetry. All detection schemes devised for three-dimensional localization therefore aim at breaking this symmetry and avoiding z-planes with small information content in order to provide the best possible precision throughout a certain axial range.
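The divergence of the axial bound at the focal plane follows from symmetry alone and can be reproduced with a toy model. The sketch below assumes a Gaussian spot whose width grows symmetrically with |z| (a Gaussian-beam-like defocus law); `width`, `widefield`, `sigma_z` and the parameters `w0`, `d` and `PIX` are illustrative assumptions, not the vectorial PSF of the manuscript.

```python
import numpy as np

PIX = 50.0                                     # assumed pixel size (nm)
AX = np.arange(-1000.0, 1000.0 + PIX, PIX)     # pixel centres (nm)
X, Y = np.meshgrid(AX, AX, indexing="ij")

def width(z, w0=110.0, d=350.0):
    """Toy defocus law: spot width (nm) depends only on |z|."""
    return w0 * np.sqrt(1.0 + (z / d) ** 2)

def widefield(z):
    """Pixelated, normalized Gaussian image of an on-axis emitter at height z."""
    w = width(z)
    return PIX**2 / (2 * np.pi * w**2) * np.exp(-(X**2 + Y**2) / (2 * w**2))

def sigma_z(psf, z, n_photons=250, b=1 / 250, eps=1.0):
    """Axial Cramér-Rao bound (nm) from I_zz = N * sum (dh/dz)^2 / (h + b)."""
    dh = (psf(z + eps) - psf(z - eps)) / (2 * eps)
    info = n_photons * np.sum(dh**2 / (psf(z) + b))
    return np.inf if info == 0 else 1.0 / np.sqrt(info)

for z in (0.0, 50.0, 200.0, 400.0):
    print(z, sigma_z(widefield, z))            # the bound is infinite at z = 0
```

Any PSF that is mirror-symmetric about z = 0 gives a vanishing z-derivative there, so the result at the focal plane does not depend on the particular defocus law assumed here.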

Aberrated detection
A straightforward way to break the symmetry of the detection PSF without major changes to an existing imaging setup is the introduction of aberrations. While many types of wavefront distortions are possible candidates, astigmatism is especially appealing because it introduces a clear distinction between positions above and below the focal plane and is easy to achieve by simply placing a cylindrical lens in the detection path [37]. The phase factor describing the deformation of the wavefront is given by the so-called aberration function [38]

$$\exp\bigl[\, i S \rho^{2} \cos(2\phi) \,\bigr], \tag{12}$$

where 0 ≤ ρ ≤ 1 and φ are normalized polar coordinates spanning the aperture of the objective lens. The resulting PSF and information content for S = 3 are shown in Fig. 1b. The symmetry is obviously broken, removing the assignment ambiguity and leaving no z-position with vanishing $i_{zz}$. We calculated the Cramér-Rao bounds for different strengths S of the astigmatism. As S increases, the breaking of the symmetry becomes more pronounced and the localization precision along the z-direction improves around the focal plane. But at the same time the lateral extent of the PSF increases and the PSF becomes asymmetric above and below the focal plane. This leads to a deteriorated and anisotropic lateral resolution. Figure 2 visualizes this tradeoff between axial and lateral precision and also shows that the interval along the optic axis over which good accuracy is obtained becomes shorter for increasing S. Being experimentally elegant and straightforward, this method has already been combined with single emitter switching microscopy [23] to yield sub-diffraction resolution along the optic axis. The setup described in [23] uses an astigmatism comparable to S = 2.

Fig. 2 (caption, beginning lost): … 5) and 1 (6). The bounds for ordinary wide-field detection (S = 0) are plotted for comparison (red).
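The tradeoff between axial and lateral precision under astigmatism can be sketched with an elliptical Gaussian whose x and y line foci are displaced axially. This is a common simplification and not the aberration function of Eq. (12): the focal offset `gap` is a hypothetical stand-in for the strength S, and `width`, `astigmatic`, `bounds` and all numbers are illustrative assumptions.

```python
import numpy as np

PIX = 50.0                                     # assumed pixel size (nm)
AX = np.arange(-1000.0, 1000.0 + PIX, PIX)
X, Y = np.meshgrid(AX, AX, indexing="ij")

def width(z, w0=110.0, d=350.0):
    """Toy defocus law for the spot width (nm)."""
    return w0 * np.sqrt(1.0 + (z / d) ** 2)

def astigmatic(z, gap):
    """Elliptical Gaussian with line foci at z = +gap (x) and z = -gap (y)."""
    wx, wy = width(z - gap), width(z + gap)
    return PIX**2 / (2 * np.pi * wx * wy) * np.exp(-X**2 / (2 * wx**2) - Y**2 / (2 * wy**2))

def bounds(z, gap, n_photons=250, b=1 / 250, eps=1.0):
    """Return the Cramér-Rao bounds (sigma_x, sigma_z) in nm for an on-axis emitter."""
    h = astigmatic(z, gap)
    dz = (astigmatic(z + eps, gap) - astigmatic(z - eps, gap)) / (2 * eps)
    dx = X / width(z - gap) ** 2 * h           # analytic dh/dx0 at x0 = 0
    info_x = n_photons * np.sum(dx**2 / (h + b))
    info_z = n_photons * np.sum(dz**2 / (h + b))
    sig_z = np.inf if info_z == 0 else 1.0 / np.sqrt(info_z)
    return 1.0 / np.sqrt(info_x), sig_z

for gap in (0.0, 100.0, 200.0, 400.0):
    print(gap, bounds(0.0, gap))               # sigma_z becomes finite, sigma_x grows
```

In this toy model a larger `gap` improves the axial bound at the focal plane while widening the spot and degrading the lateral bound, which mirrors the qualitative behaviour described for increasing S.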

Defocused detection
The somewhat disadvantageous lateral anisotropy resulting from aberrated detection can be avoided when breaking the axial symmetry by distributing the signal on two or more unaberrated detection channels featuring focal planes which are offset with respect to each other [39]. The lateral precision then remains isotropic, but because the PSFs are widened due to the defocus, there is a similar tradeoff between axial and lateral precision as for aberrated detection. Similarly to the case of astigmatic imaging, this tradeoff can be tuned, with the free parameters now being the number of channels and the distance between the focal planes. Choosing more focal planes improves the homogeneity of axial precision at the cost of lateral precision and results in almost uniform performance over a larger interval along the optic axis (Fig. 3). Likewise, increasing the distance of the two focal planes in the two-channel setup trades off lateral precision for improved axial localization until optimal axial performance at z = 0 is reached at a distance of ∼600 nm (Fig. 4). It should be noted that the exact behavior of the localization precision when changing the number of planes or their distance will depend on the SNR, that is the value of b. In particular, if the main source of noise stems from detection and is not split between channels, the overall SNR deteriorates when splitting the signal between several detectors. As observed in Fig. 3 this can lead to a deterioration of localization precision in some parts of or even over the whole axial range when adding more channels. From a practical point of view it should also be noted that using more than two channels brings about additional experimental complexity which will usually not be warranted by the expected resolution gain. In its two-channel version, the defocus method has successfully been combined with single emitter switching microscopy [24].
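A two-channel version of the defocus scheme can be sketched with the same kind of toy model: each channel receives half of the photons and is focused at ±sep/2, while the background b per pixel is applied to every channel in full (the case of detection noise that is not split between channels). The names `spot`, `biplane`, `sigma_z` and the Gaussian defocus law are illustrative assumptions, so the optimum separation found here need not match the ∼600 nm obtained from the vectorial PSF.

```python
import numpy as np

PIX = 50.0                                     # assumed pixel size (nm)
AX = np.arange(-1000.0, 1000.0 + PIX, PIX)
X, Y = np.meshgrid(AX, AX, indexing="ij")

def width(z, w0=110.0, d=350.0):
    """Toy defocus law for the spot width (nm)."""
    return w0 * np.sqrt(1.0 + (z / d) ** 2)

def spot(w):
    """Pixelated Gaussian of width w, normalized to a sum of ~1."""
    return PIX**2 / (2 * np.pi * w**2) * np.exp(-(X**2 + Y**2) / (2 * w**2))

def biplane(z, sep):
    """Two channels focused at z = -sep/2 and z = +sep/2, half the photons each."""
    return [0.5 * spot(width(z + sep / 2)), 0.5 * spot(width(z - sep / 2))]

def sigma_z(z, sep, n_photons=250, b=1 / 250, eps=1.0):
    """Axial Cramér-Rao bound (nm), summing the information of both channels."""
    info = 0.0
    for hp, hm, h in zip(biplane(z + eps, sep), biplane(z - eps, sep), biplane(z, sep)):
        info += n_photons * np.sum(((hp - hm) / (2 * eps)) ** 2 / (h + b))
    return np.inf if info == 0 else 1.0 / np.sqrt(info)

for sep in (0.0, 100.0, 300.0, 600.0):
    print(sep, sigma_z(0.0, sep))              # bound at z = 0 versus plane separation
```

With zero separation the two channels are identical wide-field images and the bound at z = 0 diverges; separating the planes makes it finite.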

Coherent detection
Both astigmatic and defocused detection have in common that axial precision is approximately 3 times worse than its lateral counterpart. The reason is simple: Because only a single lens and thus less than half of the detection aperture is used, the PSF is elongated along the axial direction. Therefore, its features are not as steep in this direction as they are laterally. The solution lies in the use of a second, opposing lens and the coherent combination of both detected wavefronts as it has been proposed in 4Pi type B microscopy [7,8] and also realized in I5M [40]. Besides considerably steepening the PSF along the optic axis (see Fig. 6) this arrangement has the added benefit of collecting about twice the number of photons from each emitter as compared to a single-lens setup. Figure 5 sketches such an interferometric detection unit. Initially a phase difference $\varphi_a$ is introduced to the electric fields emerging from the two objective lenses OL1 and OL2 before they interfere at a first beam-splitter cube. A classical 4Pi type B microscope detects only one of the channels 1 and 2 emerging from the beam splitter. For imaging, $\varphi_a$ is usually used to ensure constructive interference in the focal plane in one of the channels, but for localization this is not obligatory as long as the value of $\varphi_a$ is known. In such an arrangement the steepened features of the PSF result in vastly superior precision along the optic axis. Due to the second objective lens, twice as many photons are collected and thus the lateral performance is increased by roughly a factor of $\sqrt{2}$. However, just as for a Michelson interferometer, channel 1 exhibits constructive interference of the two detection beams wherever they interfere destructively in detection channel 2. Therefore the intensity maxima of channel 1 coincide with the intensity nodes of channel 2 and vice versa as shown in Fig. 6. This results in an approximate axial symmetry of the combined channels about several points along the optic axis. In these planes, both channels have almost zero z-derivative and their information content is very low (see Fig. 6). In fact, when choosing $\varphi_a$ to ensure constructive or destructive interference in the focal plane, exact mirror symmetry about the focal plane is restored, resulting in diverging $\sigma_z$ at z = 0. Adding a phase difference of π/2 breaks this symmetry but still results in possible ambiguous assignment and positions of very bad performance at certain axial positions as shown in Fig. 7.
We solve this problem by adding two additional channels to the setup which we marked 3 and 4 in the figure. Instead of detecting the beams emerging from the beam splitter directly as in a 4Pi type B microscope, one half of the energy in their electric fields $E_1$ and $E_2$ is extracted by a pair of secondary beam splitters while the other half is reflected and interferes at a tertiary beam splitter after another phase difference $\varphi_b$ has been introduced. The resulting two additional output beams are defined by the fields $E_3$ and $E_4$. We have for the electric fields […] and similarly […]. For general phase shifts $\varphi_a$ and $\varphi_b$ we obtain […], and selecting exactly $\varphi_b = -\pi/2$ the expressions simplify to […]. We can then rewrite Eqs. (13) and (16) and extract common phase factors $\varphi_k$ […], with the relative shifts given by $\Delta\varphi_k = \pi/2,\ 3\pi/2,\ 0,\ \pi$. The four PSFs are then proportional to the intensity distributions […], where E(r) is the wide-field detection PSF and the matrix […]. The choice of $\varphi_b$ in Eq. (15) has now ensured that channels 3 and 4 have maximum z-derivative at the points of symmetry of channels 1 and 2 (see Fig. 6), thus leading to the almost uniform combined information content and thus localization precision along the optic axis as depicted in Fig. 7. The lateral localization precision improves for the interferometric solution due to the doubled amount of photons collected through two objective lenses. A closer look at the figure also reveals that it slightly deteriorates when adding the second cavity. This is due to the additional detection noise which is added to each of the two new channels.
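One way to see how the four relative phase shifts arise is the following sketch. It is not a reproduction of the manuscript's equations: it assumes ideal, lossless 50:50 beam splitters with transfer matrix $\tfrac{1}{\sqrt{2}}\begin{pmatrix}1 & 1\\ 1 & -1\end{pmatrix}$, writes $E_A$ and $E_B$ (hypothetical names) for the fields collected by OL1 and OL2, and applies $\varphi_a$ to $E_A$. Other beam-splitter conventions shift all phases by constants.

```latex
% Sketch under assumed beam-splitter conventions; E_A, E_B are illustrative names.
\begin{align*}
E_{1,2} &= \tfrac{1}{\sqrt{2}}\left(e^{i\varphi_a}E_A \pm E_B\right), \\
E_{3,4} &= \tfrac{1}{2}\left(e^{i\varphi_b}E_1 \pm E_2\right)
         = \tfrac{1}{2\sqrt{2}}\left[\left(e^{i\varphi_b} \pm 1\right)e^{i\varphi_a}E_A
           + \left(e^{i\varphi_b} \mp 1\right)E_B\right], \\
\varphi_b = -\tfrac{\pi}{2}:\quad
E_3 &= \tfrac{1}{2}\,e^{-i\pi/4}\left(e^{i\varphi_a}E_A + e^{-i\pi/2}E_B\right), \qquad
E_4 = \tfrac{1}{2}\,e^{-i3\pi/4}\left(e^{i\varphi_a}E_A + e^{+i\pi/2}E_B\right).
\end{align*}
% Channels 1 and 2 are detected after half of their energy has been split off,
% i.e. with amplitudes E_{1,2}/\sqrt{2}, so all four detected intensities read
\begin{equation*}
\left|E_k^{\mathrm{det}}\right|^2 = \tfrac{1}{4}\left|E_A + e^{i(\delta_k - \varphi_a)}E_B\right|^2,
\qquad \delta_k = 0,\ \pi,\ -\tfrac{\pi}{2},\ +\tfrac{\pi}{2} \quad (k = 1,\dots,4).
\end{equation*}
```

Under these assumptions the four detected intensities sum to $|E_A|^2 + |E_B|^2$, so no light is lost, and the choice $\varphi_a = -\pi/2$ gives relative phases $\pi/2,\ 3\pi/2,\ 0,\ \pi$ (modulo $2\pi$), the same set as the shifts $\Delta\varphi_k$ quoted in the text. The sign of $\varphi_a$ needed to obtain this ordering depends on the convention assumed here.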
While we see from equation (11) that the results of our analysis scale with the square root of the number of photons detected from each emitter at a given value of b and thus a given SNR, the SNR itself has a significant and more complex influence in all scenarios presented. In Fig. 8 we analyze the behavior of the interferometric setup at different signal-to-noise levels. Not surprisingly, the introduction of more noise deteriorates precision everywhere but mainly outside the focal plane, where the PSF is wider and the signal level per pixel thus lower and more easily swamped by noise. In particular, the uniformity of localization precision over a large axial range, which is due to the favorable oscillating ring-like structure of the 4Pi-PSF, is eventually lost. However, we should keep in mind that the SNR we assumed throughout our manuscript is close to that actually achieved in experiments and we therefore anticipate these calculations and the results shown in Fig. 7 to be realistic estimates of the practical resolution limit.
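The two statements above, square-root scaling at fixed SNR and a separate dependence on the SNR itself, can be checked numerically in a minimal sketch. It assumes a one-dimensional pixelated Gaussian PSF with Poisson noise and a uniform background per pixel, and it interprets "a given SNR" as a fixed ratio of background to signal. The pixel size, PSF width and number of pixels are hypothetical. The starting point of 250 photons and one background count per pixel follows the conditions quoted in the conclusion.

```python
import math

# Hypothetical imaging parameters chosen for illustration only.
PIXEL_NM = 100.0
PSF_SIGMA_NM = 130.0
N_PIXELS = 15


def gauss_pdf(x):
    """Gaussian PSF density at offset x (nm) from the emitter."""
    norm = PSF_SIGMA_NM * math.sqrt(2 * math.pi)
    return math.exp(-0.5 * (x / PSF_SIGMA_NM) ** 2) / norm


def gauss_cdf(x):
    """Cumulative distribution of the Gaussian PSF."""
    return 0.5 * (1 + math.erf(x / (PSF_SIGMA_NM * math.sqrt(2))))


def crb_nm(n_photons, bg_per_pixel, x0=0.0):
    """Cramer-Rao bound (nm) on x0 for Poisson pixel counts plus background."""
    info = 0.0
    for i in range(N_PIXELS):
        lo = (i - N_PIXELS / 2) * PIXEL_NM - x0  # pixel edges relative to x0
        hi = lo + PIXEL_NM
        mean = n_photons * (gauss_cdf(hi) - gauss_cdf(lo)) + bg_per_pixel
        slope = n_photons * (gauss_pdf(lo) - gauss_pdf(hi))  # d(mean)/dx0
        info += slope ** 2 / mean
    return 1 / math.sqrt(info)


base = crb_nm(250, 1.0)
for factor in (1, 4, 16):
    sigma = crb_nm(250 * factor, 1.0 * factor)
    print(f"N = {250 * factor:5d}, fixed SNR: sigma = {sigma:.2f} nm "
          f"(improvement {base / sigma:.2f}x)")
for bg in (1.0, 5.0, 25.0):
    print(f"N = 250, background {bg:4.1f}: sigma = {crb_nm(250, bg):.2f} nm")
```

Scaling signal and background together multiplies the Fisher information by the same factor, so the bound improves with the square root of the photon number. Raising the background alone at fixed photon number degrades the bound in a way that depends on the PSF shape.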
In order to confirm that practical estimators can be found which actually reach the theoretical Cramér-Rao bound, we have performed Monte-Carlo simulations of localizations using interferometric detection. We drew 150 image realizations per z-position from the Poisson model of Eq. (4) and used the associated maximum-likelihood estimator to back-calculate the molecule position ẑ. From the histograms of errors ẑ − z and their standard deviations, both shown in Fig. 9 over the range of z values considered, we can conclude that this most intuitive choice of estimator already yields unbiased position assignment and performs close to the theoretical limit. Most importantly, the distribution of errors shows no indication of false assignments, i.e. assignments of an emitter above the focal plane to a position below it which results in a similar detection pattern. This demonstrates that all ambiguities have been removed by the two additional detection channels. We have also verified our results for all other detection schemes using the same Monte-Carlo technique (data not shown).
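The procedure can be sketched in a strongly reduced form. The example below does not use the image model of Eq. (4). It assumes instead a scalar toy model in which each of the four channels is reduced to a single Poisson count with mean proportional to 1 + cos(2kz + Δϕ_k) plus background, and it restricts the search to one interference period because this toy model has no PSF envelope to distinguish periods. The wavenumber, photon numbers, background, grid spacing and test positions are hypothetical. Only the four phase shifts and the 150 realizations per position follow the text.

```python
import math
import random

PHASES = (math.pi / 2, 3 * math.pi / 2, 0.0, math.pi)  # shifts quoted in the text
K = 2 * math.pi * 1.33 / 600.0  # hypothetical wavenumber (1/nm)
SHARE = 125.0                   # hypothetical signal photons per channel
BACKGROUND = 20.0               # hypothetical background counts per channel
PERIOD = math.pi / K            # axial period of the toy model (nm)


def means(z):
    """Expected counts in the four channels for an emitter at axial position z."""
    return [SHARE * (1 + math.cos(2 * K * z + p)) + BACKGROUND for p in PHASES]


def poisson(rng, lam):
    """Knuth's multiplication method; adequate for the moderate means used here."""
    limit, k, prod = math.exp(-lam), 0, rng.random()
    while prod > limit:
        k += 1
        prod *= rng.random()
    return k


# Candidate positions covering one period, with precomputed log-means.
GRID = [i * PERIOD / 400 - PERIOD / 2 for i in range(400)]
GRID_LOGS = [[math.log(m) for m in means(z)] for z in GRID]
GRID_SUMS = [sum(means(z)) for z in GRID]


def ml_estimate(counts):
    """Grid-search maximum of the Poisson log-likelihood."""
    def loglik(i):
        return sum(n * l for n, l in zip(counts, GRID_LOGS[i])) - GRID_SUMS[i]
    return GRID[max(range(len(GRID)), key=loglik)]


def crb(z):
    """Cramer-Rao bound (nm) on z in the toy model."""
    info = sum((SHARE * 2 * K * math.sin(2 * K * z + p)) ** 2 / m
               for p, m in zip(PHASES, means(z)))
    return 1 / math.sqrt(info)


rng = random.Random(0)
results = {}
for z_true in (-60.0, -30.0, 0.0, 30.0, 60.0):
    errors = [ml_estimate([poisson(rng, m) for m in means(z_true)]) - z_true
              for _ in range(150)]
    rms = math.sqrt(sum(e * e for e in errors) / len(errors))
    results[z_true] = (rms, crb(z_true), max(abs(e) for e in errors))
    print(f"z = {z_true:6.1f} nm: RMS error {rms:.2f} nm, bound {crb(z_true):.2f} nm")
```

A grid search is used only because it is easy to verify. Any optimizer that finds the global maximum of the likelihood would serve the same purpose. Within one period the four quadrature channels determine the interference phase uniquely, which is why no false assignments are expected in this toy setting.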
Just as in 4Pi microscopy, the 6-fold boost in axial resolution comes at a certain cost. The use of a second, opposing lens and the need for a stable phase relation between the two detection paths increase complexity, limit the maximum sample thickness and may render imaging of samples with large refractive index inhomogeneities more difficult. However, the experimental implementation of the 4-channel interferometric detection is largely equivalent to previous implementations of 4Pi microscopes. Matching of the optical detection paths and their alignment has been demonstrated in 4Pi microscopy of type C [42], and in multifocal 4Pi microscopy a constant phase relation between both paths was readily achieved over the whole field of view [41]. Proper alignment and phase stability of channels 3 and 4 behind the second beam splitter bring about no additional challenges because the second cavity can be designed very small and thus stable and because the phase difference induced is fixed and independent of refractive index mismatch or sample dispersion. On the sample side, simultaneous 4Pi imaging above and below the nucleus of living cells [44] and of 3D samples with a thickness of up to 50 μm [43] has been successfully demonstrated, and application of our proposed method to such samples should be equally feasible.

Conclusion
Interferometric detection results in a vastly superior, almost uniform precision of axial localization and improved lateral precision due to the higher collection efficiency. Because neither defocus nor aberrations are necessary to break the axial symmetry, there is no tradeoff between lateral and axial precision. For 250 detected photons per objective lens and a background level of one count per camera pixel, which are currently typical conditions, a resolution of well below 10 nm in all three dimensions will be reached in a single-emitter switching based microscope. Depending on the number of detected photons, the objective lenses used, and other imaging parameters this number will vary, but the approximately 6-fold improvement in the axial and 1.4-fold improvement in the lateral direction gained from introducing a second collection lens can be expected irrespective of the experimental details.
For all methods, optimal precision is only reached for a certain range of axial positions. Therefore, they can be further improved by restricting the active molecules to an axially thin layer. A basic option is to cut the samples mechanically into thin slices, but this is not compatible with the imaging of living cells and it is experimentally complex. A more attractive option is the application of a multiphoton induced switching process that is inherently restricted to a layer of ∼800 nm [22], combined with subsequent scanning of this layer along the optic axis. Interestingly, only the interferometric detection maintains its almost uniformly high resolution over such a large axial range while the performance of all other approaches will be compromised, either by rejecting events at the edges or by tuning to a poorer resolution which remains stable over the whole applicable range. Independently of resolution, switching on or exciting only in a single layer may become a necessity in densely labeled samples to avoid overlapping events and excessive background from out-of-focus fluorescence. Besides, the second lens can be used to coherently add the laser light inducing the excitation or activation, thus squeezing the layer of switched-on molecules at least 3-fold. Future specialized dyes may well deliver more than ten thousand photons per switching event, allowing position determination with accuracies of below 1 nm in our setup. The resolution is then no longer limited by the imaging process but by the size and movement of the molecular labels and by their interactions, making the proposed method another powerful variant of how to achieve molecular scale resolution [12,45] in far-field optical microscopy.

Fig. 3. Radial Cramér-Rao bounds σ_x, σ_y (a) and axial bound σ_z (b) for the defocused detection scheme with two or more channels with different focal planes (solid). The different graphs correspond to setups with 2, 4 and 6 channels as indicated by the numbers in the plot. Their focal planes were assumed to be evenly spaced with the highest and lowest one at z = ±300 nm. As before, the performance for wide-field detection in a single channel has been plotted for comparison (dotted).

Fig. 5. Schematic illustration of an interferometric 4-channel detection setup. The light is collected by two opposing objective lenses OL1 and OL2. A first beam splitter is used to combine the beams from the two objective lenses. The phase retardation ϕ_a is used to control the relative phase of the two beams, e.g. in order to produce constructive interference in the focal spot in beam E_1 and destructive interference in E_2. By overlaying half their intensity at a second beam splitter, two additional beams are produced. Their relative phase shifts can be offset relative to those of the first two channels using a second phase retardation ϕ_b.

Fig. 6. xz-cuts of the PSFs (top row) for wide-field detection (a) and the different channels of the interferometric setup (b). The second row shows the differential information content, which is much higher for interferometric detection due to the steepened axial features of the PSF. Please note the change of scale as compared to Fig. 1. Using just one beam splitter and combining the two resulting channels results in approximate axial symmetry and almost vanishing information content in the minima and maxima of the PSFs (second row). Only the combination of all four channels leads to almost uniformly high information content along the optic axis (bottom row).

Fig. 7. Radial Cramér-Rao bounds σ_x, σ_y (a) and axial bound σ_z (b) for the interferometric detection scheme (blue) in comparison to the two-channel defocus setup (red) with two channels focused at z = ±300 nm and an astigmatism of strength S = 3 (black). The Cramér-Rao bounds for interferometric detection using just one beam splitter and two channels are also shown (blue, dotted). As before, we have plotted the bounds for wide-field detection in a single channel for comparison (red, dotted). As above, introducing more channels can lower radial precision due to a smaller signal-to-noise level.

Fig. 9. Monte-Carlo simulation of a maximum-likelihood estimator in a 3D single switching event based microscope using 4-channel interferometric detection. (a) Mean squared error of the axial position determined during 150 simulated localizations per individual axial value (grey) and corresponding Cramér-Rao bounds (blue). (b) Histogram plot of the errors showing no side maxima at any axial position which would indicate systematic false assignments.