The quest for ultimate super resolution

With the wealth of super-resolution techniques available in the literature it is useful to provide a succinct review of the general concepts involved in the different schemes. In this paper we group super-resolution schemes into several broad categories to simplify comparison, and to elucidate the factors limiting their respective resolutions.


Introduction
As long as there have been microscopes, researchers have been pushing the frontier of optical resolution in a quest to image ever smaller objects. Success might mean unlocking the secrets to biological processes, materials properties, and other mysteries. Early on, the optical wavelength was known to be a limit [1][2][3][4], and this motivated the push toward imaging systems with ever shorter wavelengths, such as those using x-rays or electron beams [5,6]. However, in the latter part of the 20th century, it was realized that other options existed [7][8][9][10][11][12][13][14][15][16]. Beginning with near-field scanning probes, images were produced that did not appear to have a wavelength limit [17][18][19]. Then super-resolution techniques emerged, which made use of nonlinearities to push far-field image resolution beyond the diffraction limit [20][21][22][23][24][25][26]. Now there are so many super-resolution schemes, each claiming to outperform the others, that a casual observer could easily be confused into thinking that there are no fundamental limits to achievable resolution [3]. In this paper, we briefly summarize the resolution limits of existing techniques by dividing them into broad categories so that the reader will clearly understand the current limits and options for the future.
The paper is divided into several sections. First is an introduction to the important concepts involved in super-resolution. Second is a discussion of incoherent super-resolution schemes. Several schemes are discussed, and the role of nonlinearity in these super-resolution techniques is emphasized. Third, coherent super-resolution schemes are discussed, and the factors determining their resolution are explained. Examples are used to simplify comparison of coherent and incoherent [27][28][29][30] schemes. Fourth, super-resolution schemes are discussed that do not beat the diffraction limit encompassed by Maxwell's equations, but nevertheless achieve sub-wavelength resolution. Finally, other related topics are briefly discussed.

Important concepts
Before beginning, it is necessary to define what is meant by resolution. Normally, two objects are considered to be resolved when two conditions are satisfied: (1) the two objects can be distinguished from one another, and (2) the location of each object is known to higher precision (ideally accuracy) than their separation. In the most common definition of diffraction-limited resolution, the Rayleigh criterion, both of these conditions are satisfied simultaneously. This is illustrated in figure 1 for two point objects. Here, each object independently produces a sinc function response at the detector.
Original content from this work may be used under the terms of the Creative Commons Attribution 3.0 licence. Any further distribution of this work must maintain attribution to the author(s) and the title of the work, journal citation and DOI.
When one object lies at the position corresponding to the first node of the other object's (sinc) response, then the sum of the two sinc functions (solid line) develops a dip in the middle. At this separation, curve fitting can determine both the number and locations of the objects, and so both of the above criteria are met. Note that curve fitting can sometimes be used to resolve objects slightly closer than this, but ambiguities quickly arise as the distance decreases, especially when noise is present.
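The dip described above can be checked numerically. The following is a minimal sketch, assuming that the detected intensity of each point object is the square of the sinc response and that the two intensities add incoherently; the function name and the unit of position (one node spacing) are illustrative choices, not taken from the paper.

```python
import numpy as np

def two_point_intensity(x, separation):
    """Incoherent sum of two sinc^2 point responses (assumed intensity model)."""
    # np.sinc(u) = sin(pi*u)/(pi*u), so the first node of each response is at u = 1.
    return np.sinc(x) ** 2 + np.sinc(x - separation) ** 2

# Rayleigh-type separation: the second object sits on the first node of the first.
rayleigh_separation = 1.0

at_object = two_point_intensity(np.array([0.0]), rayleigh_separation)[0]
at_midpoint = two_point_intensity(np.array([0.5]), rayleigh_separation)[0]
dip_ratio = at_midpoint / at_object

print(f"midpoint intensity / intensity at an object: {dip_ratio:.3f}")
```

Under these assumptions the midpoint intensity is 8/pi^2 (about 0.81) of the intensity at either object, which is the two-hump pattern that curve fitting exploits.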
In super-resolution, the conditions of distinguishability and precision can often be addressed separately. For example, in what is perhaps the most widely-used super-resolution technique (PALM/STORM [24][25][26], which is discussed later), the location precision is determined by the centroid position, while the distinguishability is achieved by associating different color or time bins with different objects [24][25][26][31][32][33][34]. Note that in this case, distinguishability need not have anything to do with the diffraction limit. Conversely, other super-resolution techniques (e.g., STED/GSD [20,21,23], also discussed later), simultaneously achieve both distinguishability and sub-diffraction precision.
Noise is a crucial concept in super-resolution. Without noise, it is possible to devise super-resolution schemes which appear to have no resolution limit. For example, the separation of two point sources could be exactly determined using curve fitting if noise were absent. Therefore, although super-resolution schemes are often described without reference to noise, in an attempt to simplify the conceptual analysis, their ultimate resolution limit must include a noise analysis. To see this, consider a popular definition of resolution as being the maximum spatial frequency in the image [35,36]. In figure 2 we illustrate how this definition is lacking in the presence of noise [37]. Here figure 2(a) shows two different spatial frequencies whose relative amplitudes are adjusted such that they have the same slope at the origin. The higher spatial frequency is better at distinguishing nearby objects, which satisfies the first requirement of super-resolution as stated above. However, the second requirement, which relates to location precision, does not necessarily have a clear relation to spatial frequency. This is illustrated in figure 2(b) where the same noise amplitude is added to both spatial frequencies. In general, the precision to which the centroid of a noisy image spot can be determined is given by the noise amplitude divided by the steepest slope of the curve. For simplicity, figure 2(b) compares regions of steepest slope in two sine waves. Since the slopes and noise amplitudes are the same for both spatial frequencies, curve fitting will give the same location precision (or resolution) for both. Since distinguishability need not have anything to do with position resolution (it can be provided, for example, by using different colors), the resolution is independent of spatial frequency in this example. Thus the spatial frequency definition of resolution is incomplete without a noise analysis.
Another example illustrating the tradeoff between noise, spatial frequency, and resolution is shown in figure 3. By blocking low spatial-frequency components in a conventional lens-based imaging system, the average spatial frequency is increased, but the localization precision is significantly degraded because of the resulting lower light levels. This leads to a worse location precision than when the lower spatial frequencies are also allowed to pass. Hence, admitting the lower spatial frequencies improves the resolution even as the average spatial frequency decreases.
Finally, it should be noted that the concept of image contrast is often considered to be separate from resolution, placing its own limits on what information can be extracted from an image. However, the two are actually closely related, since a low-contrast image with noise-free background could easily be converted to a high-contrast image by subtracting off the background. The disadvantage of a low-contrast image is that its signal to noise ratio is determined by the noise on the background, which is often larger than the noise on the interesting parts of the image.
Figure 1. Diffraction-limited resolution of two objects in an imaging system that gives a sinc (= sin(x)/x) function (dashed line) response for a point object. The solid line is the total intensity as a function of position in the image plane. A two-hump pattern is seen in the total intensity curve, which indicates both the presence of two objects and their separation. In this example, one object is at the node of the sinc function centered on the other object (dashed curve).
Figure 2. (a) Two different spatial frequencies. One has a high frequency and a small amplitude, while the other has a low frequency and a large amplitude, such that they have the same gradient at the origin. (b) When a constant noise is added and curve fitting is used to compute the location of the zero-crossing, the precision is the same for both spatial frequencies since it is given by the noise divided by the gradient.
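The statement that precision equals noise divided by slope can be illustrated with a small Monte Carlo experiment. This is only a sketch under stated assumptions: additive Gaussian noise, a known slope, a linearized fit over a narrow window around the zero crossing, and hypothetical amplitudes, frequencies, and window size chosen for illustration.

```python
import numpy as np

def zero_crossing_precision(amplitude, k, noise_sigma, rng, trials=4000):
    """Standard deviation of the fitted zero-crossing position of A*sin(k*x)."""
    slope = amplitude * k                     # gradient at the zero crossing
    x = np.linspace(-0.01, 0.01, 21)          # narrow window: sine is nearly linear
    signal = amplitude * np.sin(k * x)        # true zero crossing at x = 0
    noisy = signal + rng.normal(0.0, noise_sigma, size=(trials, x.size))
    # Linearized least-squares estimate of the crossing for a known slope.
    estimates = x.mean() - noisy.mean(axis=1) / slope
    return estimates.std()

rng = np.random.default_rng(0)
noise_sigma = 0.1

# Same slope (A*k = 10) but a tenfold difference in spatial frequency.
low_freq = zero_crossing_precision(10.0, 1.0, noise_sigma, rng)
high_freq = zero_crossing_precision(1.0, 10.0, noise_sigma, rng)
expected = noise_sigma / (10.0 * np.sqrt(21))  # noise / slope, averaged over 21 samples

print(f"low frequency:  {low_freq:.5f}")
print(f"high frequency: {high_freq:.5f}")
print(f"noise/slope:    {expected:.5f}")
```

Both spatial frequencies give essentially the same localization precision, because only the noise and the gradient enter the estimate.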
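The tradeoff can be made semi-quantitative with a Cramér-Rao estimate. The sketch below is not the calculation behind figure 3; it assumes a one-dimensional real, symmetric pupil mask, photon-counting noise only, a photon number proportional to the open aperture width, and the resulting Fisher information of 4<k^2> per detected photon. The mask geometry and photon budget are hypothetical.

```python
import numpy as np

def localization_bound(k, mask, photons_per_unit_aperture=1000.0):
    """Cramer-Rao bound on position for a point source imaged through a 1D pupil mask."""
    dk = k[1] - k[0]
    open_width = mask.sum() * dk                      # transmitted fraction of the pupil
    n_photons = photons_per_unit_aperture * open_width
    mean_k2 = np.sum(k ** 2 * mask) / mask.sum()      # average squared spatial frequency
    fisher_information = 4.0 * mean_k2 * n_photons    # assumed: 4<k^2> per photon
    return 1.0 / np.sqrt(fisher_information), n_photons, mean_k2

k = np.linspace(-1.0, 1.0, 2001)                      # pupil coordinate, edge at |k| = 1
full_aperture = np.ones_like(k)
edges_only = (np.abs(k) >= 0.9).astype(float)         # all but the edges masked off

sigma_full, photons_full, k2_full = localization_bound(k, full_aperture)
sigma_edges, photons_edges, k2_edges = localization_bound(k, edges_only)

print(f"full aperture: <k^2> = {k2_full:.3f}, photons = {photons_full:.0f}, sigma = {sigma_full:.4f}")
print(f"edges only:    <k^2> = {k2_edges:.3f}, photons = {photons_edges:.0f}, sigma = {sigma_edges:.4f}")
```

In this model the masked aperture has the higher average spatial frequency, yet the full aperture localizes better because it collects roughly ten times more photons.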

Incoherent super-resolution techniques
In the photo-activated localization microscopy and stochastic optical reconstruction microscopy (PALM and STORM, respectively) super-resolution schemes, location precision is determined by centroid-based or curve-fitting techniques. While centroid techniques have been around for a long time [38], the distinguishability needed to resolve objects separated by less than the diffraction limit has historically posed a major hurdle. Initially, this problem was addressed by attaching to each object a different colored fluorophore [39]. However, due to the relatively broad spectrum of optical emitters at room temperature, only a handful of resolvable colors are possible. To overcome this limitation, in PALM/STORM, time is used instead of frequency (color) [24][25][26][31][32][33][34] for providing distinguishability, as illustrated in figure 4(a). Briefly, fluorophores with a limited stability are first activated, and then illuminated until they bleach. The diffraction-limited image spot produced by each fluorophore (large spots in figure 4(a)) is then used to calculate a centroid (small spots in figure 4(a)). By sensitizing only a few of these bleachable fluorophores at any one time, it is ensured that no more than one emitter contributes to each diffraction-limited image spot. When these fluorophores have bleached and their corresponding centroids have been computed, then another set is sensitized, and the process repeated until the full super-resolution image is constructed.
Note that the centroid computation method can sometimes diverge for certain diffraction limited imaging functions. However, this is not a problem in practice as computational work-arounds exist, and in any case a region of interest is often defined to be within a few diffraction widths to avoid interference from nearby emitters.
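The centroid step can be illustrated with a short simulation. This is a minimal sketch, assuming a Gaussian image spot (for which the centroid is well behaved, unlike the diverging cases mentioned above), no background, and no pixelation; the spot width and photon numbers are hypothetical.

```python
import numpy as np

def centroid_precision(n_photons, psf_sigma, rng, trials=2000):
    """Scatter of the centroid of n_photons detections from one Gaussian image spot."""
    # Each row holds the detected photon positions of one simulated emitter at x = 0.
    photons = rng.normal(0.0, psf_sigma, size=(trials, n_photons))
    return photons.mean(axis=1).std()

rng = np.random.default_rng(1)
psf_sigma = 100.0  # width of the diffraction-limited spot in nm (hypothetical value)

precision = {n: centroid_precision(n, psf_sigma, rng) for n in (100, 400, 1600)}
for n, sigma in precision.items():
    print(f"N = {n:5d}: centroid precision = {sigma:5.2f} nm, "
          f"psf_sigma/sqrt(N) = {psf_sigma / np.sqrt(n):5.2f} nm")
```

The centroid precision shrinks as the spot width divided by the square root of the number of collected photons, which is why the bleaching lifetime of each fluorophore matters.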
Another major incoherent super-resolution technique is stimulated emission depletion (STED) [20]. To see how STED works, first consider figure 4(b). Here a fluorophore is excited with a laser beam having a donut shape (i.e. a node in the center). Treating the fluorophore as a two-level molecule, strong laser excitation produces a competition between fluorescence and stimulated emission which causes saturation of the fluorescence at the high-intensity regions of the laser spot, as illustrated in figure 4(b). This saturation causes steepening of the fluorescence intensity gradient near the center of the donut beam. Alternatively, saturation can be viewed as producing a higher effective spatial frequency [35]. For the saturated donut beam, both interpretations give the same answer: the location precision increases with the square root of the peak laser intensity, where the square root is a consequence of the quadratic intensity distribution near the node of the laser donut beam (i.e. the effective node size is determined by the position at which the laser pumping rate is equal to the incoherent decay rate).
Figure 3. Role of spatial frequencies in the diffraction limit. When all but the edges of the lens are masked off (top row), the imaging system has a higher spatial frequency, and therefore higher resolution in the absence of noise (middle column). However, when photon noise is included (right column), adding lower spatial frequencies (middle and bottom rows) is found to greatly improve the localization precision, while having almost no effect on distinguishability.
Although a saturated donut beam is capable of super-resolution, it is mainly restricted to relatively simple objects with only a few fluorophores within each diffraction-limited spot. This is because the fluorophores outside of the node region still give a fluorescence signal that contributes to noise. There is also ambiguity from the steep fluorescence gradient at the outer edge of the donut beam. These limitations are overcome by STED as illustrated in figure 4(c). Briefly, the donut beam is only used to de-excite via stimulated emission, but not to excite the molecule [20]. This is possible because the phonon sidebands of most dye molecules allow absorption and stimulated emission at very different wavelengths (or colors). For excitation (or absorption) of the fluorophore a uniform (Gaussian) probe beam with a moderate (non-saturating) intensity is used instead of a donut beam. For de-excitation a donut beam is still used and as its intensity increases it dominates over both the excitation rate of the probe and the fluorescence emission rate. The result is that fluorescence is quenched everywhere except in the region near the donut beam node. The resulting fluorescence lineshape becomes narrow, with no background. This is illustrated in figure 4(c). Note that the position precision is the same as for the single donut beam excitation of figure 4(b), but now the elimination of background gives the distinguishability needed to satisfy both requirements for super-resolution. Although the laser intensity requirements for STED can be severe, a variant of STED, ground state depletion (GSD) uses stimulated emission into a metastable (non-fluorescing) state to greatly reduce the donut beam intensity needed [23].
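The narrowing of the STED fluorescence spot can be reproduced with a simple rate-equation model. The sketch below assumes a steady-state two-level picture in which fluorescence is proportional to the weak excitation rate divided by (1 + W(x)/Gamma), with a hypothetical one-dimensional donut profile that is quadratic near the node; positions are in units of the donut radius, and the peak ratios 1, 10, and 100 mirror those quoted for figure 4(c).

```python
import numpy as np

def sted_fwhm(peak_ratio, x):
    """FWHM of the fluorescence spot for a donut with peak W/Gamma = peak_ratio."""
    excitation = np.exp(-x ** 2 / 2.0)                      # weak Gaussian probe beam
    # Assumed donut profile: zero at the node, maximum value peak_ratio at |x| = 1.
    depletion = peak_ratio * x ** 2 * np.exp(1.0 - x ** 2)  # W(x) / Gamma
    fluorescence = excitation / (1.0 + depletion)           # steady-state rate model
    dx = x[1] - x[0]
    above_half = fluorescence >= 0.5 * fluorescence.max()
    return np.count_nonzero(above_half) * dx

x = np.linspace(-3.0, 3.0, 120001)
fwhm = {ratio: sted_fwhm(ratio, x) for ratio in (1.0, 10.0, 100.0)}

for ratio, width in fwhm.items():
    print(f"peak W/Gamma = {ratio:5.0f}: fluorescence FWHM = {width:.4f}")
```

For large peak ratios the width in this model falls as the inverse square root of the donut intensity, in line with the square-root scaling stated in the text.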

Resolution of incoherent techniques PALM/STORM and STED/GSD
The resolution of the centroid techniques PALM/STORM is given by the location precision, since distinguishability is achieved by time separation. As stated above, the precision is the noise divided by the intensity gradient. The intensity gradient is bounded by the maximum intensity (of the point source image) divided by its (diffraction-limited) width. Thus, the precision (and resolution) is improved by increasing the signal to noise ratio (SNR). It can be shown that the maximum SNR is given by the square root of the total number of photons $N$ in the image spot. The number of photons in turn depends on the pumping rate $W$ (absorption and re-emission rate) divided by the bleaching frequency $\gamma_{\mathrm{bleach}}$ (inverse bleaching time). Therefore, the resolution improvement $R$ over the conventional diffraction limit is given by $R = \sqrt{N} = \sqrt{W/\gamma_{\mathrm{bleach}}}$ [3]. For STED/GSD, the resolution is determined by the effective size of the saturated node in the donut beam. This is in turn determined by the position at which the de-excitation pumping rate $W$ begins to exceed the natural decay rate $\Gamma$ of the excited state in STED, or the metastable decay rate $\gamma_{\mathrm{meta}}$ in GSD (i.e. the pumping rate is 'thresholded' by the relevant decay rate). For a donut beam, as mentioned above, the intensity increases quadratically away from the node, so a straightforward calculation gives a resolution improvement $R$ over the diffraction limit which goes as the square root of the maximum pumping rate divided by the decay rate, $R = \sqrt{W/\gamma_{\mathrm{meta}}}$ [3] for GSD.
Since PALM/STORM and STED/GSD have similar formulas for resolution improvement, it is of interest to see if there is an example that could relate the width of the saturated hole in the donut beam (see figure 4(b)) to the total number of photons collected. This is shown in figure 5(a). Here the light grey trace shows a donut-beam image with Poissonian intensity noise. By greatly magnifying this (black trace), the donut-beam node appears to become narrow, producing an effectively higher spatial frequency, as in figure 4(b). The question is how to set the threshold, since no saturation nonlinearity has been assumed in the emitter. Examining the dark trace of figure 5(a), it is seen that nonlinearity of the photon detection process itself gives a discrete number of thresholds; namely there must be an integer number of photodetection events per pixel. When the threshold is set at the single photon level, the approximate width of the saturated node in the donut beam matches the position precision given by centroid-based curve fitting. Hence even in PALM/STORM, where resolution enhancement might seem like it requires a nonlinearity, closer examination shows that the photo-detection process itself provides the nonlinearity.
Figure 4. [...] This produces higher spatial frequencies. This is shown for peak laser pumping rates of 1, 10, and 100. (c) In STED the excitation laser is replaced with a weak Gaussian-shaped probe beam. The de-excitation laser is also a strong donut beam. The result is background-free super-resolution. This is shown for peak donut beam intensities of 1, 10, and 100.
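The 'straightforward calculation' for the donut beam can be sketched as follows. The derivation assumes that the de-excitation rate near the node is $W(x) \approx W_{\max}(x/d)^2$, where $d$ is a diffraction-limited length scale and $W_{\max}$ is the peak pumping rate; the symbols $d$, $x_0$ and $W_{\max}$ are introduced here for illustration and do not appear in the original text.

```latex
\begin{align}
  W(x_0) = \Gamma
    \quad &\Rightarrow \quad
    W_{\max}\left(\frac{x_0}{d}\right)^{2} = \Gamma , \\
  x_0 &= d\,\sqrt{\frac{\Gamma}{W_{\max}}} , \\
  R \equiv \frac{d}{x_0} &= \sqrt{\frac{W_{\max}}{\Gamma}} .
\end{align}
```

Here $x_0$ is the position at which the pumping rate equals the decay rate, so it sets the effective node size. Replacing $\Gamma$ by the metastable decay rate $\gamma_{\mathrm{meta}}$ gives the corresponding expression for GSD.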

Transition from incoherent to coherent super resolution techniques
To make the connection between coherent and incoherent super-resolution, recall that STED/GSD made use of a saturation nonlinearity to improve resolution. A driven two-level atom (or molecule) can also be saturated by strong excitation. This is illustrated in figure 5(b) for a damped two-level atom, where it is illustrated that donut beam excitation can produce narrow features suitable for super-resolution. Although the damped two-level system can give super-resolution, the real advantage to coherent systems appears when the damping is small. This is illustrated in figure 6. Since optical transitions are always heavily damped at room temperature, the example in figure 6 uses a radio-frequency (RF) spin transition as the two-level system. The donut laser beam is replaced by a donut-like (in 1D) RF excitation, produced by applying RF to a pair of anti-Helmholtz coils. A strong resonant RF field drives the spins from ground state to excited state and back again periodically in time by a process known as Rabi population oscillation [40]. If the interaction time is fixed and the resonant RF field amplitude varies in space, the excited-state population will depend on position. For example, near the coils where the field is strong many Rabi oscillations will have taken place, but at the field node none will take place (see figure 6) [16]. The result is that the excited state population versus position can have a high effective spatial frequency that in principle can give super-resolution. In practice, however, a single sinusoidal image function is not useful for imaging, and so something more is needed.
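The spatial pattern produced by Rabi oscillations in a field gradient can be sketched numerically. This example assumes an undamped, resonantly driven two-level system starting in the ground state, so that the excited-state population is sin^2(Omega t / 2), and an RF amplitude that grows linearly from the node to the coil; the coil separation and the number of oscillations are hypothetical values.

```python
import numpy as np

def rabi_population(x, coil_separation, max_pulse_area):
    """Excited-state population versus position for a linearly varying RF amplitude."""
    # Pulse area Omega(x)*t: zero at the node (x = 0), max_pulse_area at the coil.
    pulse_area = max_pulse_area * np.abs(x) / (coil_separation / 2.0)
    return np.sin(pulse_area / 2.0) ** 2

coil_separation = 1.0
n_rabi = 10                                   # full Rabi oscillations at the coil
max_pulse_area = 2.0 * np.pi * n_rabi         # one oscillation per 2*pi of pulse area

x = np.linspace(0.0, coil_separation / 2.0, 5001)   # from the node to one coil
population = rabi_population(x, coil_separation, max_pulse_area)

# Count the population maxima (fringes) between the node and the coil.
interior = population[1:-1]
is_peak = (interior > population[:-2]) & (interior >= population[2:])
n_fringes = int(np.count_nonzero(is_peak))
fringe_spacing = (coil_separation / 2.0) / n_fringes

print(f"fringes between node and coil: {n_fringes}")
print(f"fringe spacing: {fringe_spacing:.3f} (coil separation = {coil_separation})")
```

The number of fringes equals the number of Rabi oscillations completed near the coil, so the fringe spacing shrinks in proportion to the product of Rabi frequency and interaction time.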
To make the coherent two-level system useful for imaging, an approach analogous to STED/GSD is used, wherein excitation and de-excitation are done with fields of different frequencies; namely a strong donut-like field is applied at DC, and a weak RF field is used to excite the two-level atoms. This is illustrated in figure 7. In contrast to STED, the donut field does not de-excite the atoms, but rather tunes the atoms out of resonance with the RF field to prevent excitation. At the node of the donut field, the atoms can be excited by a resonant RF field giving a STED-like image, as shown by the object at the node in figure 7. The big advantage of a coherent system is that this position selectivity is not limited to only the node, but can be located anywhere between the coils by simply shifting the RF frequency, as illustrated by the second object (on the right) in figure 7. Note that this RF selectivity gives distinguishability in addition to position precision, and that these improve with increasing DC field gradient. This imaging scheme is known as magnetic resonance imaging (MRI) which is a special case of gradient field imaging (GFI) [36,41,42].

Resolution of coherent super-resolution schemes
For the Rabi gradient scheme of figure 6, the resolution is given in principle by the spatial frequency, which is determined by the coil separation divided by the maximum number of Rabi population oscillations near the coils, $N_R$. This in turn is determined by the product of the interaction time $t$ and the Rabi frequency $\Omega = \mu B$, where $\mu$ is the magnetic moment and $B$ is the RF magnetic field strength. The interaction time is ultimately limited by the inverse of the spin decay rate, $\gamma_s$. This gives a super-resolution improvement factor of $R = N_R = \Omega/\gamma_s$. Note that at RF frequencies the coil separation $s$ is analogous to the diffraction limit, as will be discussed later, because the RF wavelength is very long.
For MRI (figure 7), the resolution is determined by the spin linewidth $\gamma_s$ divided by the field gradient, $\Delta/s$, where $\Delta$ is the maximum RF detuning that can be produced by the DC fields near the coil. Again, taking $s$ as the effective diffraction limit gives a resolution improvement factor of $R = \Delta/\gamma_s$. To relate $\Omega$ and $\Delta$, consider two spin levels which are initially degenerate. Applying a DC magnetic field along the quantization axis gives an energy level splitting of $\Delta = \mu B$, where here $B$ is the DC magnetic field strength. This is known as the Zeeman shift. If the same magnetic field is applied perpendicular to the quantization axis, population oscillations are produced between the two spin states at the Larmor frequency $\Omega_L = \mu B$. Since a DC field can be considered resonant with the degenerate spin levels, this Larmor oscillation is analogous to Rabi oscillation. Thus, the resolution improvements of the Rabi gradient and the MRI schemes are essentially the same.
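The gradient-field picture can be checked with a short numerical example. This is a sketch only: it assumes a Lorentzian spin resonance of full width gamma_s, a linear detuning gradient equal to Delta/s as in the text, and hypothetical values for the coil separation, maximum detuning, and linewidth (all in arbitrary units).

```python
import numpy as np

def mri_response(x, rf_detuning, gradient, linewidth):
    """Lorentzian excitation profile versus position (assumed lineshape)."""
    local_detuning = rf_detuning - gradient * x
    return 1.0 / (1.0 + (2.0 * local_detuning / linewidth) ** 2)

coil_separation = 1.0      # s, the effective diffraction limit
max_detuning = 1000.0      # Delta, maximum detuning from the DC field
linewidth = 1.0            # gamma_s, full width at half maximum
gradient = max_detuning / coil_separation

x = np.linspace(-0.5, 0.5, 200001)
dx = x[1] - x[0]

results = []
for rf_detuning in (0.0, 250.0):
    profile = mri_response(x, rf_detuning, gradient, linewidth)
    center = x[np.argmax(profile)]                     # selected position
    fwhm = np.count_nonzero(profile >= 0.5) * dx       # spatial resolution
    results.append((rf_detuning, center, fwhm))
    print(f"RF detuning {rf_detuning:6.1f}: position = {center:.4f}, "
          f"width = {fwhm:.5f}, improvement = {coil_separation / fwhm:.0f}")
```

Shifting the RF frequency moves the selected position without changing its width, and the ratio of coil separation to width reproduces the improvement factor Delta/gamma_s.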

Near-field super-resolution techniques
The classic resolution limit was obtained for propagating plane waves in the far-field limit. Near-field techniques can of course achieve a higher resolution. In the absence of any nonlinearities, the resolution limit of these techniques is determined from Maxwell's equations [45]. To illustrate this, consider the example of a scanning near-field illumination source that excites two objects as shown in figure 8. As seen, the two objects can be resolved when their separation is larger than the size of the illumination source, or its distance away, whichever is larger. In the far-field, the source size is replaced by the propagation wavelength in Maxwell's equations. Hence, the Maxwell resolution limit encompasses not only the far-field diffraction limit, but also scanning probe techniques like near-field scanning optical microscopy [46] and variants. It should be noted that near-field imaging is often called super-resolution because it is not limited by the wavelength. However, true super-resolution goes beyond the Maxwell limit by making use of nonlinearities [47]. Here it should be noted that Maxwell's equations relate the maximum field gradient to the maximum field amplitude divided by the source size (or distance), or wavelength in the far field. Thus the definition of position precision as the noise divided by the gradient appears the most convenient metric for comparing different super-resolution schemes (at least for the position precision criterion) [3].
Figure 6. [...] Here the donut laser beam is replaced by a field that crosses zero, thereby giving a radio-frequency (RF) intensity node analogous to a 1D cross-section of a donut laser beam. For strong RF drive, the resulting Rabi population oscillations produce a high spatial-frequency in excited state population. The damped two-level atom response is superimposed.
Figure 7. In magnetic resonance imaging, a DC magnetic field gradient allows position-selective excitation of the two-level spin system by a resonant RF field. In this setup, anti-Helmholtz coils create the field gradient, which is depicted by the dashed curve. The horizontal coil (top) is used for exciting spins in the objects as well as for measuring their magnetic response [43,44]. The response of the two objects shown is plotted as solid green and blue curves, which are the images.
Figure 8. An optical field generated by a single near-field source is scanned close to two objects. The resulting scattered light (solid curve) can resolve the objects if their separation is larger than the source size $R_0$ or source distance $R$, whichever is larger. For reference, the image of a single object is shown by the dashed curve.

Dielectrics and refractive index
In dielectric materials, optical illumination induces dipoles in the atoms comprising the material. These induced dipoles can be envisioned as near-field sources, and so should enable resolution beyond the free-space diffraction limit. Indeed this is the case. This continuous distribution of sources effectively modifies the speed of light in the medium, and with it, the maximum slope of the electric field. This is quantified by defining an index of refraction n, which is the factor by which resolution is enhanced. Although this is in fact super-resolution, it is usually considered to be encompassed by the diffraction limit.
Negative-index materials are often studied for super-resolution applications [48][49][50][51][52]. Most negative-index super-resolution demonstrations are done with metals because they have negative electric permittivity. This is because a metal has free electrons and acts as if it has a resonance at DC, with a width given by the collision frequency. For objects embedded in a metal-based negative-index material, the resolution limit would be expected to be improved by the quality factor of the DC resonance, which is approximately the optical frequency divided by the collision frequency. However, the large absorption in metals would tend to limit this super-resolution to small distances from the objects, because the field gradient decays as the field decays.
To achieve full super-resolution performance, it is preferred that negative-index materials have both $\mu$ and $\epsilon$ negative. This happens above some resonance frequency in the material [53,54], where both electric and magnetic resonances are needed. Such materials are generally rare in nature. Notable exceptions would be materials having optical transitions with both magnetic dipole and electric dipole components. In practice, custom-engineered meta-materials are normally used for this purpose [55]. However, a review of this field is beyond the scope of the current paper.

Nonlinearities in the propagation medium (laser filamentation)
In addition to object and detector nonlinearities, it is possible to perform super-resolution using nonlinearities of the intervening medium. The most common example of this is to use the self-focusing nonlinearity to create laser filaments near the object being illuminated [56,57]. In this case, the resolution limit is the size of the filament, which is determined when diffraction and self-focusing are balanced (see figure 9). Note this can also be viewed as near-field illumination where the source is produced by a nonlinearity instead of by a fixed-size object as in figure 8. The advantage of this technique over those requiring nonlinear response of the object is that here the illumination power reaching the target could in principle be lower, and therefore could be less damaging to the object. Of course, laser filamentation in air generally requires an intense laser, but in some media the threshold might be greatly reduced. For example, by placing nanoparticles inside the medium [58,59] optical forces could modify the refractive index in the direction needed for self-focusing, even at very low excitation powers.

Quantum and quantum-inspired super-resolution
Quantum imaging has also been proposed for super-resolution, though mostly in the context of lithography. It involves using the fact that multi-photon Fock states interfere to form effective standing waves that have $n$ times as many nodes as a conventional optical standing wave, where $n$ is the number of photons in the Fock state [60]. For its implementation, materials are needed that absorb only the Fock state of interest (i.e. will not absorb single photons, but only $n$ photons). It happens that such materials do exist [61][62][63][64], and surprisingly even when these materials are illuminated by intense classical light, they achieve the same resolution as when illuminated by pure Fock states [65,66]. Here it should be mentioned that a number of quantum-inspired super-resolution techniques have been proposed recently, for example those that make use of quantum interference as in Raman dark states (CPT/EIT) [67][68][69][70][71][72]. Although these schemes do make use of the quantum mechanical properties of the object, their super-resolution enhancement factors do not beat classical limits.
Quantum enhanced imaging can also take advantage of multi-photon correlations [73][74][75]. Centroid methods that involve multiple photons have been demonstrated [76]. Finally, quantum illumination [77] is a technique that is capable of achieving exponential improvement ($2^n$, where $n$ is the Fock state number) in the signal-to-noise ratio (SNR) of a weak image in the presence of strong background light [78]. Since noise and resolution are closely related, this may in principle be adapted in the future to give a dramatic resolution enhancement.
Figure 9. Self-focusing of a laser beam by a nonlinear medium. The laser is self-focused by the intervening medium, sharpening the breadth of the exciting field. In the figure, two objects are placed next to one another. The self-focusing of the laser enables interaction with only one of the objects. Here, the resolution limit is determined by the width of the self-focused beam.
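The n-fold increase in node density can be illustrated with an idealized fringe pattern. The sketch below assumes an n-photon absorption pattern of the form (1 + cos(2 n k x))/2, which is a common textbook idealization and not a formula given in this paper; the function names and the grid are hypothetical.

```python
import numpy as np

def n_photon_fringes(x, wavelength, n_photons):
    """Idealized n-photon standing-wave absorption pattern (assumed functional form)."""
    k = 2.0 * np.pi / wavelength
    return 0.5 * (1.0 + np.cos(2.0 * n_photons * k * x))

def count_nodes(pattern):
    """Count the interior local minima of a sampled pattern."""
    interior = pattern[1:-1]
    is_node = (interior < pattern[:-2]) & (interior <= pattern[2:])
    return int(np.count_nonzero(is_node))

wavelength = 1.0
x = np.linspace(0.0, wavelength, 4001)

nodes = {n: count_nodes(n_photon_fringes(x, wavelength, n)) for n in (1, 2, 3)}
for n, count in nodes.items():
    print(f"n = {n}: {count} nodes per wavelength")
```

In this idealization a conventional standing wave (n = 1) has two nodes per wavelength, and an n-photon pattern has 2n.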

Conclusion
In summary, a number of super-resolution techniques were compared. In the incoherent cases, the resolution enhancement was found to be given by a ratio of a pumping rate $W$ and the linewidth of some decaying (preferably metastable) level $\gamma$, specifically $R = \sqrt{W/\gamma}$. For the coherent schemes, the enhancement was given by the ratio of Rabi frequency $\Omega$ to transition linewidth $\gamma$, $R = \Omega/\gamma$. To relate these two performance factors, note that for a strongly damped two-level atom, the pumping rate is given by $W = \Omega^2/\Gamma^*$, where $\Gamma^*$ is the de-coherence or damping rate. Substituting this into the incoherent limit gives $R = \Omega/\sqrt{\Gamma^* \gamma}$, which bears a much closer resemblance to the coherent case. Perhaps not surprisingly, coherent gradient-field techniques like MRI have the best super-resolution performance, not only because spin transitions have narrow linewidths compared to the shifts induced by typical gradient fields, but also because their coherence allows super-resolution at all locations in the gradient field, not just near a node. At room temperature, optical coherence times are generally too short for coherent super-resolution, which is why spin transitions must be used for MRI. Finally, it is noted that MRI is an example of a super-resolution scheme that makes use of both near-field enhancement and nonlinearity. For this reason the conventional picture of MRI does not invoke the concept of wavelength, but rather gives the resolution in terms of linewidth divided by a field gradient. For this, and other previously-mentioned reasons, we have avoided concepts like wavelength and spatial frequency as unifying concepts for super-resolution, and instead stress the use of field (or intensity) gradients.
We gratefully acknowledge support of the NIH SBIR #HHSN26820150010C, National Science Foundation Grant EEC-0540832 (MIRTHE ERC) and the Robert A Welch Foundation (Award A-1261). JSB is supported by the Herman F Heep and Minnie Belle Heep Texas A&M University Endowed Fund held/administered by the Texas A&M Foundation.