The Correlation Confocal Microscope

A new type of confocal microscope is described which makes use of intensity correlations between spatially correlated beams of light. It is shown that this apparatus leads to significantly improved transverse resolution.


Introduction
The development of the confocal microscope [1,2] led to a revolution in microscopy. The insertion of source and detection pinholes leads to improved resolution and contrast and the ability to image thin optical sections of a sample in a noninvasive manner. This has made the confocal microscope ubiquitous in modern biomedical optics research, where it is vital not only for imaging, but also for dynamic light scattering ([3,4,5]), fluorescence correlation spectroscopy ([6,7]), and other types of experiments. Because all of these experiments rely on achieving the smallest confocal volume (the overlap of the images in the sample of the source and detector pinholes), a great deal of effort has gone into improving the resolution of confocal microscopes in order to minimize this volume. One common approach, which exploits the idea of correlated excitation, is two-photon microscopy [8,9], in which a pair of photons must be absorbed by a fluorescent molecule simultaneously; since this only happens with appreciable probability where the photon density is very high, only the central part of the confocal volume contributes, leading to a reduced effective confocal volume.
In this paper we investigate the question of whether the advantages of intensity correlation methods at detection and of confocal microscopy can be combined. We will show that use of transverse (lateral) spatial correlations and of coincidence detection can significantly improve the resolution of a confocal microscope. Some care needs to be taken in how this is done since following the naive method of simply sending pairs of spatially correlated photons into a standard confocal microscope will not work; the pinhole destroys all spatial correlations. So instead, we send in uncorrelated photons and use a form of postselection to enforce correlations among the photons we choose to detect.
It is to be noted that the correlation method described here shares in a sense a common underlying philosophy with two-photon microscopy, since the two-photon microscope also uses spatial correlations, but at the excitation stage: uncorrelated photons are inserted into the microscope at the source, but the requirement that the two photons interact with the same fluorescent molecule effectively enforces spatial correlations among the photon pairs that contribute to the detected signal. We do something similar here, but use a different method in order to enforce the correlations at the detection stage. This will reduce the need for high intensities, thus allowing the use of less powerful lasers, as well as reducing possible damage to the sample.
It should also be pointed out that the idea of using intensity correlations with entangled photon pairs has been applied before to obtain subwavelength microscopic resolution [31], though not in conjunction with confocal microscopy. This previous work, known as quantum microscopy, required entangled states, in contrast to the method here, which will work with a completely classical light source.
The outline of the paper is as follows. In section 2, confocal microscopy is briefly reviewed. Section 3 discusses the problem of combining confocality with correlation. In section 4 we introduce the setup for a generalized version of the correlation confocal microscope, and show that it does indeed lead to significant improvement in resolution over the standard (uncorrelated) confocal microscope. In order to make the principles of operation clearer, we will initially consider in section 4 the unrealistic case of a generalized microscope that requires two identical copies of the object being viewed; in section 5, we then show how to reduce the apparatus to the realistic case of a single object. Section 6 looks at the axial resolution, with conclusions following in section 7.

The Standard Confocal Microscope.
The basic setup of a standard confocal microscope is shown in figure 1. (For a more detailed review see [32]). The two lenses are identical. In real setups they are in fact usually the same lens, with reflection rather than transmission occurring at the sample. (In this paper we will for simplicity always draw the transmission case, but most of the considerations will apply equally to the reflection case.) This lens serves as the objective; it has focal length f and radius a, and serves to focus the light going in and out of the sample. The sample is represented at point y by a function t(y); depending on the setup, t(y) will represent either the transmittance or reflectance of the sample. At the first lens, the distances are chosen so that the imaging condition is satisfied; as a result, light entering the microscope through the source pinhole is focused to a small diffraction-limited disk (actually a three-dimensional ellipsoid) centered at a point P in the sample. Any stray light not focused to this point is blocked by the pinhole, thus providing the first improvement in contrast between P and neighboring points. The distances at the second lens also satisfy the imaging condition, so the second lens performs the inverse of the operation carried out by the first one, mapping the diffraction disk in the sample back to a point at the detection plane. The pinhole in this plane blocks any light not coming from the immediate vicinity of P, thus providing further contrast. Together, the two pinholes serve to pass light from a small in-focus region in the sample and to block light from out-of-focus regions. The in-focus point is then scanned over the sample. The end result is a significant improvement in contrast over the widefield microscope. The double passage through the lens also leads to improved resolution.
To quantify the resolution improvement, we need to look at the impulse response function $h(y)$ and transverse point-spread function (PSF) of the microscope. Let $h_i(\xi, y)$ ($i = 1, 2$) be the impulse response functions for the first and second lenses individually (including the free space propagation before and after the lens). Up to multiplication by overall constants, these are of the form
where $\tilde p(\mathbf q)$ is the Fourier transform of the aperture function $p(x')$ of the lens. $\mathbf q$ and $k$ respectively denote the transverse and longitudinal momenta of the incoming photon. We assume that $q \ll k$. From now on, we will also assume a circular, aberration-free lens of radius $a$, in which case
$$\tilde p(\mathbf q) = 2\pi a^2\, \frac{J_1(aq)}{aq},$$
where $q$ is the magnitude of $\mathbf q$ and $J_1$ is the Bessel function of first order. Applying a pinhole at one end, we may also define
Imagine that a sample is being scanned by the microscope. The amplitude impulse response for the microscope while focused at sample point $y$ is
If we insert a sample which is nontransmitting except at a single point, $t(y) = \delta(y)$, the impulse response becomes the coherent spread function
with the corresponding point-spread function given by
Setting the distances at the first and second lenses to be equal for simplicity ($z_i' = z_i$ for $i = 1, 2$), use of equation (3) gives us ([32, 33]):
to be compared with the widefield microscope, which has
Since $\tilde p$ is a sharply-peaked function, the higher power in the confocal result leads to a further increase of sharpness and a resulting improvement in resolution.
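The sharpening that comes from the higher power of the aperture transform can be checked numerically. The following is a minimal sketch, not the calculation of the paper: it assumes the textbook normalized forms, with the widefield intensity PSF proportional to $[2J_1(v)/v]^2$ and the confocal one proportional to $[2J_1(v)/v]^4$, where $v$ is a dimensionless radial coordinate introduced here for illustration. The helper names (`bessel_j1`, `airy_amplitude`, `half_width`) are hypothetical.

```python
import math

def bessel_j1(x, steps=2000):
    """J_1(x) from Bessel's integral (1/pi) * int_0^pi cos(t - x sin t) dt (midpoint rule)."""
    h = math.pi / steps
    total = 0.0
    for n in range(steps):
        t = (n + 0.5) * h
        total += math.cos(t - x * math.sin(t))
    return total * h / math.pi

def airy_amplitude(v):
    """Normalized aperture transform 2 J_1(v) / v, equal to 1 at v = 0."""
    if abs(v) < 1e-9:
        return 1.0
    return 2.0 * bessel_j1(v) / v

def half_width(profile, lo=0.0, hi=3.0, tol=1e-6):
    """Bisect for the v at which a profile that decreases on [lo, hi] falls to 1/2."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if profile(mid) > 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

def widefield(v):
    # Assumed widefield intensity PSF: second power of the aperture transform.
    return airy_amplitude(v) ** 2

def confocal(v):
    # Assumed confocal intensity PSF: fourth power of the aperture transform.
    return airy_amplitude(v) ** 4

hw_wide = half_width(widefield)
hw_conf = half_width(confocal)
print("half width at half maximum, widefield:", hw_wide)
print("half width at half maximum, confocal: ", hw_conf)
print("ratio confocal / widefield:", hw_conf / hw_wide)
```

Under these assumptions the confocal half width comes out at roughly 0.72 of the widefield one, which is the resolution gain attributed above to the double passage through the lens.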

Correlations versus Confocality.
We now wish to introduce spatial correlations into the confocal microscope. However, we immediately run into a problem: the source pinhole is ideally a delta function in position space, which means that its Fourier transform is a constant in momentum space. So, regardless of the spectrum of transverse spatial momentum q entering the pinhole, the spectrum leaving the pinhole is approximately flat; all transverse momentum correlations are lost. Thus it seems that we must choose between keeping either the correlations or the source pinhole, but not both. However, the problem may easily be avoided in several ways. One category of solution involves removing the source pinhole and preserving the confocality by other means. Notice that the only purpose the source pinhole serves is to make sure that all of the light entering the microscope is focused at the same point in the object plane. But this can be achieved without a pinhole and therefore without destroying spatial correlations. This can be done in several ways: (1) One method, often used in the standard confocal microscope, is to have the beam hit the lens parallel to the axis and focus to a point one focal length f away (figure 2a). We can then introduce correlations by arranging for pairs of narrow, well-localized beams of light to strike the lens at equal distances from the axis.
(2) A second method is to use pairs of photons produced by spontaneous parametric downconversion at correlated angles, so that if they are traced backward they both seem to emanate from the same point (figure 2b); that point would then be analogous to the pinhole. The crystal would have to be far enough from the lens for the cross-section of the pump beam to appear to be pointlike.
(3) A third method is to have a very narrow beam reflecting from a fixed point on a rotating mirror, then pass through a beam splitter. The two beams then have anticorrelated directions, and both trace back to the same illuminated point on the mirror. The mirror, as it rotates, fills the entire lens aperture with light over time.
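The reason a physical source pinhole cannot coexist with transverse momentum correlations can be illustrated numerically. The sketch below is a one-dimensional stand-in, assumed here purely for illustration: a uniform slit of finite width replaces the circular pinhole, and its Fourier transform is a sinc function. As the width shrinks toward the delta-function limit, the spectrum over a fixed range of $q$ becomes flat, so it carries no information about the incoming momentum. The function names and the value of `q_max` are hypothetical.

```python
import math

def slit_spectrum(q, width):
    """Fourier transform of a uniform 1D slit of the given width, normalized to 1 at q = 0."""
    arg = 0.5 * q * width
    return 1.0 if abs(arg) < 1e-12 else math.sin(arg) / arg

def flatness(width, q_max, samples=200):
    """Ratio of smallest to largest |spectrum| on 0 <= q <= q_max (1.0 means perfectly flat)."""
    values = [abs(slit_spectrum(q_max * n / samples, width)) for n in range(samples + 1)]
    return min(values) / max(values)

q_max = 10.0  # assumed range of transverse momenta, arbitrary inverse-length units
for width in (1.0, 0.1, 0.01):
    print("slit width", width, "-> flatness", flatness(width, q_max))
```

The wide slit gives a strongly structured spectrum, while the narrowest one is flat to better than one part in a thousand over the same range.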
The latter two methods are used to enforce spatial correlations in quantum ([16]) and classical ([23]) versions, respectively, of correlated-photon or "ghost" imaging. In all three methods, the light would be focused at a point satisfying the imaging condition. In all of these setups we can think of the arrangement as providing an effective or virtual pinhole, even though there is no pinhole physically present. In each case, we may proceed formally in the same manner that we would if there were an actual source pinhole. There is a problem with these approaches, though. Although correlation is maintained up to the location of the first lens, diffractive effects at the lens then destroy a portion of the correlation before the sample is reached. Taking into account the possibility of aberration in the lens as well, the benefits deriving from these methods seem to be limited.
However, a completely different class of solution seems more promising: we can take the light to be initially uncorrelated when it enters the microscope, but then select out pairs of photons which happen to be at the same spatial distance from the axis, thus reducing the likelihood of detecting pairs which are at different transverse distances. This effectively enforces correlation at the location of the sample, where it is needed.
In order to enforce the correlation, we need to tag the photons in some manner according to where they happen to intersect the sample, selecting the value of some property of the photon in a manner that depends on position in the sample plane. One way to do this would be to construct filters with narrow, position-dependent pass bands; then, if broadband light is used for illumination, the light leaving the sample will be segregated, with photons of different frequency leaving different points of the sample. Thus, the frequency of the detected light provides an indicator of position and can be used to enforce spatial correlation.
This frequency-tagging method will be discussed in more detail elsewhere. In the current paper, we instead discuss in detail an apparatus in which we tag our photons, not by frequency, but by phase. The two beams are then combined at a beam splitter, converting the phase-tagging into correlation. Effectively, we have two confocal microscopes in the two branches of a Mach-Zehnder interferometer; the interferometer serves to compare the photon phases and to suppress pairs that differ by a large phase. A pair of detectors and a coincidence counter then count the number of pairs that survive this comparison.
Note that interferometry is not an essential element of the basic correlation confocal scheme, but only of the particular embodiment of it that we discuss in the following sections; it appears only because we chose to tag the photons by phase. If we use frequency-tagging instead, no interferometry would be required. The frequency scheme could be implemented, for example, by spatially separating different frequencies at the output with a diffraction grating; then a coincidence count would occur when the pixels at the same relative position are simultaneously triggered in two CCD cameras.
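The coincidence condition for the frequency-tagged variant can be sketched in a few lines. This is only an illustration under assumed conventions, not a description of an actual detection chain: each camera is taken to report a list of `(timestamp, pixel_index)` events, a pair counts as a coincidence when the pixel indices agree and the timestamps differ by no more than a chosen window, and the function name and sample data are hypothetical.

```python
def count_tagged_coincidences(events_a, events_b, window):
    """Count pairs (one event per camera) with equal pixel index and |t_a - t_b| <= window."""
    # Group camera-B timestamps by pixel so each camera-A event is compared
    # only against events at the same relative position.
    by_pixel = {}
    for t_b, pixel in events_b:
        by_pixel.setdefault(pixel, []).append(t_b)

    count = 0
    for t_a, pixel in events_a:
        for t_b in by_pixel.get(pixel, []):
            if abs(t_a - t_b) <= window:
                count += 1
    return count

# Hypothetical event lists: (timestamp, pixel index), arbitrary time units.
camera_a = [(0.0, 3), (1.0, 5), (2.0, 7)]
camera_b = [(0.1, 3), (1.05, 6), (2.5, 7)]

n_coincidences = count_tagged_coincidences(camera_a, camera_b, window=0.2)
print("coincidences:", n_coincidences)
```

In this sample data only the first pair survives: the second pair lands on different pixels, and the third falls outside the time window.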
In the next section, we introduce correlations into a confocal microscope via matching of phases and then examine the use of the resulting correlation confocal microscope to scan over a sample.

The Correlation Confocal Microscope.
Here we introduce the correlation confocal microscope. To make the operating principle clearer, we initially work out the formulas for an "unfolded" version of the microscope in which the apparatus has two branches and two identical objects, one in each arm; later we show how to reduce this apparatus to the more useful version with a single arm and single sample. The basic setup is shown in figure 3. The locations in the transverse planes at the first lens, second lens, and image plane are represented respectively by $x'$, $x''$, and $x$. Note that $x'$ and $q$ are related by $q/k = x'/z_2$. Subscripts 1 and 2 on $q$, $x'$, or any variable other than $z$ will denote which branch (upper or lower) of the apparatus is being referenced. $t_1$ and $t_2$ represent the effect of the samples $S_1$ or $S_2$ in the two arms. In the object plane, $y'$ will denote an arbitrary point in the plane, and $y$ will be the point actually being viewed in the object; in other words, $-y$ is the displacement vector of the object during the scan. (The minus sign appears in our convention in order to keep the image from being inverted; alternatively, $+y$ could be thought of as the displacement of the microscope, with the sample held fixed.) The arguments $y_1$ and $y_2$ of the sample functions $t_1$ and $t_2$ will be partially linked by the spatial correlation that we will impose; we will find that this leads to an improvement of the lateral or transverse resolution.
In front of the sample in each branch, introduce a linear position-dependent phase shift
for some constants $c$ and $\mathbf b$. We assume that $\mathbf b$ is a radial vector pointing out from the axis, so that the $\phi_i(y)$ depend only on the magnitude $y = |\mathbf y|$. The necessary phase shifts may be produced by a graded index material; for a material 0.1 mm to 1 mm thick, the gradient of index versus radial distance required is of the order of $\Delta n/\Delta r \sim 10^{-4}$ to $10^{-3}\ \mathrm{mm}^{-1}$, which is well within the range of what is currently technologically feasible. Alternatively, the refractive index gradient could be achieved by inserting conically-shaped optical wedges in front of each sample.
We now go through the operation of this apparatus. Uncorrelated photons are input at the left. Suppose for simplicity that they are all in the same state, represented by the creation operator $\hat a_0^\dagger$. The first beam splitter (BS$_1$) splits the beam between the upper arm (1) and lower arm (2) with equal probabilities. The creation operators in these branches are related to the initial state creation operator by
Given photons in each of the two branches (1 and 2), the creation operators are multiplied by the impulse response functions for passage through the confocal microscope in that branch. The creation operators for the upper ($j = 1$) and lower ($j = 2$) branch just before the second beam splitter are then given by
So if two photons are input at BS$_1$, then by using equations (12)-(13) we see that the state incident on the second beam splitter is proportional to
We may then substitute into this result the fact that
where 3 and 4 represent respectively the used and unused output ports of BS$_2$. Taking the inner product of the resulting expression with the state having two photons leaving BS$_2$ through port 3, we have the two-photon amplitude in branch 3:
In the last expression, we have set $\phi_1 \equiv \phi$, $\phi_2 \equiv -\phi$, and assumed that the two objects or samples are identical ($t_1 = t_2 \equiv t$).
The amplitude may be put in the form (up to an overall multiplicative constant):
Note that the expression in the square brackets equals $-1$ when $\phi(y') = \phi(y'') = \pi/4$ and vanishes when $\phi(y') = \phi(y'') = 0$. So if we arrange for $\phi(y)$ to drop from $\pi/4$ at the axis to zero at the edge of the Airy disk, this expression will strongly suppress values of $y'$ and $y''$ that fall far from the axis, near the edge of the Airy disk (i.e. at the first zero of the Airy function). This is the key to the resolution enhancement. (In the notation introduced earlier, this means $c = \pi/4$ and $\mathbf b = \frac{\pi}{4R_{\mathrm{airy}}}\hat{\mathbf r}$.) If the light is reflecting from the sample (as opposed to being transmitted through it), then it will pass through the phase modulation twice, so the modulation should be half as large in this case.
When two photons simultaneously emerge from BS$_2$ into arm 3, the final beam splitter BS$_3$ simply routes them (50% of the time) to two different detectors, so that a coincidence count may be measured. The result is that, up to overall constants, the coincidence rate will be:
where $A_3$ is given by equation (18). Note that the coincidence rate does not actually give the image of $|t(y)|^2$, as would normally be obtained from a microscope, but rather it gives the image of $|t(y)|^4$. So the square root of the coincidence count must be taken before comparison with the images from other microscopes. As a special case, we can obtain the PSF by taking $t(y)$ to be a delta function. Keeping in mind the square root mentioned above, this then gives us:
where $y$ is the magnitude of $\mathbf y$. We see that the factor $\sin^2(\phi(y))$ is responsible for the improvement of resolution compared to the standard confocal microscope. This factor suppresses the counting of photon pairs with values of $y'$ and $y''$ that differ significantly from each other; this, when combined with the factors of $\tilde p$, reduces the contribution to the coincidence rate of points in the outer part of the Airy disk. Note that all of this comes about because the factors involving the trigonometric functions in eq. (18) provide a coupling between the integration variables $y'$ and $y''$. Without this coupling, which amounts to a spatial correlation between the photons detected from the two branches, eq. (18) factors into two independent integrals, each of which looks like the amplitude of a standard confocal microscope. Thus, without the correlation, nothing is gained that could not be obtained from simply using a standard confocal microscope and squaring the output intensity.
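The narrowing effect of a $\sin^2(\phi(y))$ factor can be illustrated with a toy calculation. Everything specific in this sketch is an assumption made for illustration and is not taken from the equations of the paper: the standard confocal PSF is modeled as $[2J_1(v)/v]^4$ in a dimensionless radial coordinate $v$, the correlation PSF is modeled as that same function multiplied by $\sin^2(\phi(v))$ and renormalized to 1 on axis, and $\phi$ is taken to fall linearly from $\pi/4$ on axis to zero at the first Airy zero. The helper names are hypothetical.

```python
import math

FIRST_AIRY_ZERO = 3.8317  # first zero of J_1: edge of the Airy disk in units of v

def bessel_j1(x, steps=2000):
    """J_1(x) from Bessel's integral (1/pi) * int_0^pi cos(t - x sin t) dt (midpoint rule)."""
    h = math.pi / steps
    total = 0.0
    for n in range(steps):
        t = (n + 0.5) * h
        total += math.cos(t - x * math.sin(t))
    return total * h / math.pi

def airy_amplitude(v):
    """Normalized aperture transform 2 J_1(v) / v, equal to 1 at v = 0."""
    return 1.0 if abs(v) < 1e-9 else 2.0 * bessel_j1(v) / v

def phase(v):
    """Assumed phase profile: pi/4 on axis, linear drop to 0 at the Airy-disk edge."""
    return (math.pi / 4.0) * max(0.0, 1.0 - v / FIRST_AIRY_ZERO)

def standard_psf(v):
    # Assumed standard confocal PSF (fourth power of the aperture transform).
    return airy_amplitude(v) ** 4

def toy_correlation_psf(v):
    # Assumed toy model: standard PSF times sin^2(phi), normalized to 1 on axis.
    return standard_psf(v) * math.sin(phase(v)) ** 2 / math.sin(math.pi / 4.0) ** 2

def half_width(profile, lo=0.0, hi=3.0, tol=1e-6):
    """Bisect for the v at which a profile that decreases on [lo, hi] falls to 1/2."""
    while hi - lo > tol:
        mid = 0.5 * (lo + hi)
        if profile(mid) > 0.5:
            lo = mid
        else:
            hi = mid
    return 0.5 * (lo + hi)

hw_standard = half_width(standard_psf)
hw_toy = half_width(toy_correlation_psf)
print("half width, standard confocal:", hw_standard)
print("half width, toy correlation model:", hw_toy)
print("ratio:", hw_toy / hw_standard)
```

The toy profile is narrower than the standard one because the phase factor weights the axis most heavily. Since the model ignores the double integral over $y'$ and $y''$, its width should not be compared quantitatively with the simulated curves in the figures.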
The improvement introduced in this manner may be seen from numerical simulations shown in figures 4-6. The dotted curves are the transmittance of an object: two square functions separated by a gap in fig. 4, three square functions in fig. 5, and a more complicated shape in fig. 6. The solid curve (blue online) gives the square root of the coincidence rate of the correlation confocal microscope, while the dashed curve (red online) gives the output intensity of a standard confocal microscope. It can be seen clearly in these figures that the correlation microscope produces output that matches the original object significantly more closely than the standard confocal microscope. In particular, in figures 4 and 5 the correlation microscope clearly distinguishes the separate square objects and correctly gives their relative heights, whereas the standard confocal microscope simply produces a single blurred peak. Similarly, in fig. 6 the correlation microscope clearly detects the presence of the "shoulder" on the left-hand curve; the same feature is invisible to the standard confocal microscope. Setting the sample equal to a delta function, we obtain fig. 7, which shows the transverse intensity PSF. It exhibits a roughly 60% decrease in width, compared to the standard confocal microscope.
Despite the substantial improvement in resolution, one problem with this proposed microscope is also apparent: any aberrations in the lenses or any aberrations induced by the sample will disrupt the phase-matching, and so could reduce the benefits of this scheme. If we chose to use the frequency-tagging scheme mentioned earlier, instead of the phase-tagging version discussed here, a similar degradation could be caused by frequency dispersion in the sample. Thus, it may be advantageous to attempt finding a variation of this setup which could incorporate dispersion cancellation ([17, 18, 19]) or aberration cancellation ([20, 21, 22]) techniques.

Reduction to One Sample.
The setup in the previous section is unrealistic in the sense that it requires two identical copies of the object or sample in order to function. Here we go to the realistic case: we show how the apparatus may be altered without affecting the basic principle of operation, in order to require only a single copy of the sample.
The new setup is shown in fig. 8. All the light now passes through the same sample, gaining a phase shift $e^{+i\phi(y)}$. But then the beam is split at BS$_1$, after passing through the detection pinhole; half of the beam continues onward to the second beam splitter BS$_2$ unaltered, while the other half is deflected downward to a phase conjugating mirror. The mirror reverses the sign of the phase and deflects the beam back to BS$_1$. The half of this beam that is transmitted through BS$_1$ at this second encounter then recombines with the unaltered beam at BS$_2$; from this point on, all is as it was for the two-branch version of the previous section, assuming $t(y)$ is real. In the lower branch between BS$_1$ and BS$_2$, either a high-density optical filter or another 50/50 beam splitter must be inserted in order to equalize the intensity in the upper and lower branches.
If t(y) is complex, one copy of it will also be complex conjugated at the mirror. The factor t(y ′ + y)t(y ′′ + y) inside the integrals of eq. (18) now becomes t(y ′ + y)t * (y ′′ + y); since sample-