Full-field cavity enhanced microscopy techniques

Quantum enhanced microscopy allows for measurements at high sensitivities and low damage. Recently, multi-pass microscopy was introduced as such a scheme, exploiting the sensitivity enhancement offered by multiple photon-sample interactions. Here we theoretically and numerically compare three different contrast enhancing techniques that are all based on self-imaging cavities: CW cavity enhanced microscopy, cavity ring-down microscopy and multi-pass microscopy. We show that all three schemes can lead to sensitivities beyond the standard quantum limit.


I. INTRODUCTION
Cavity enhanced measurements are ubiquitous in science and technology. In microscopy, the offered sensitivity enhancement has for example been exploited in cavity scanning microscopy [1,2], in Tolansky interferometry [3] and in multi-pass microscopy [4,5]. While the first is a point-scanning technique, in which a fiber-based microcavity is scanned across a sample, the latter two offer a full field of view. In Tolansky interferometry, cavity enhancement is achieved by placing a flat mirror at a slight angle on top of the sample, which also has to be highly reflective. The incoupled light bounces back and forth between the two mirrors, and the interference between the multiply reflected beams is highly sensitive to the distance between the specimen and the mirror. Using this simple technique, metallic surface topographies are routinely characterized on the nm level. However, the angle between the two mirrors leads to beam walk-off and therefore to a non-local response. This reduces the achievable transverse resolution to a few wavelengths of the probe light [6]. This can be avoided if the sample is placed in a self-imaging cavity [7,8], as done in multi-pass microscopy [4], a geometry that allows for 2D imaging and that is applicable to a wider range of samples, as long as photon loss is small.
Here we analyze such cavity enhanced measurements based on self-imaging cavities and differentiate between three regimes: (i) the continuous wave scheme (CW), in which a continuous beam of light is coupled into and out of the self-imaging cavity via its end mirrors; (ii) the ring-down scheme (RD), in which a pulse of light is incoupled into the self-imaging cavity and a fraction of it is outcoupled every time the pulse interacts with one of the semi-transparent end mirrors, where the detection can be done either in a time-resolved way, in which the number of interactions is recorded for each detected photon, or in a time-integrating way; and (iii) the multi-pass scheme (MP) [4], in which a pulse of light is incoupled into the self-imaging cavity and interacts with the specimen exactly m times before it is outcoupled and detected. We first discuss these techniques analytically in the matrix optics formalism and derive expressions for the expected signal strength of bright-field (BF), dark-field (DF) and Zernike phase contrast (Znk) microscopy measurements (section II). We then apply our findings to the cavity enhanced detection of mono- and few-atomic films of different materials, such as carbon and boron nitride (section III), and analyze the performance of each technique in terms of the achievable signal-to-noise ratio (SNR) per absorbed photon. We show that cavity enhanced microscopy techniques outperform classical microscopy techniques in these terms. This agrees with results from the quantum measurement community [9,10], where it has been shown that multi-passing represents a quantum optimal approach to phase measurements [11] that allows overcoming the shot-noise limit and approaching the Heisenberg limit [12,13].
Besides the sensitivity enhancement for the detection of weak signals, cavity enhanced microscopy techniques will thus be of great interest for the study of photo-sensitive materials, for which a higher SNR cannot be achieved by using more probe photons. One example is live cell microscopy, where it has been shown that, even at visible wavelengths, long term observations or high-intensity microscopy techniques can cause cell death of a large fraction of a cell population [14]. MP microscopy has also been proposed for transmission electron microscopy [15], where sample damage sets bounds on the spatial resolution obtained for structural biology [16].

II. THEORETICAL MODEL
A setup for cavity enhanced microscopy is shown in Fig. 1. The following analysis will be restricted to the paraxial ray-optics regime, which is a good approximation in cases not affected by the diffraction limit. It will further be restricted to scalar fields. In this idealized scenario, the self-imaging cavity between the two mirrors at z = 0 and z = 8f is comprised of infinite thin lenses for perfect imaging, two idealized beam splitters as a model for the semi-transparent mirrors, and a thin refractive index profile representing the sample plane in the center of the arrangement at z = 4f. Given an input light field coupled in from the left (z = 0), the optical response of the sample will be encoded in the amplitude and phase of the field outcoupled through the right mirror (z = 8f) after multiple cavity roundtrips and sample interactions. First the analytic expressions for the outcoupled field and the energy absorbed by the sample will be derived in section II A and section II B, respectively. In section II C, a possible post-processing in a subsequent 4f lens arrangement for Zernike and dark-field imaging [17,18] will be discussed.

FIG. 1. […] incoupling and an outcoupling mirror ($M_i$ and $M_o$, respectively). The lenses are spaced such that, according to their respective focal lengths $f_{1\ldots4}$, a microscope is formed on either side of the sample plane S. For simplicity we will restrict the following analysis to the case where $f_i = f$, resulting in unity magnification on either side of the sample. When the sample is illuminated from the left, a mirror-flipped image will be formed on $M_o$ and the reflected light will be re-imaged onto the sample, which is now illuminated with an image of itself. After m interactions, light is either actively or passively outcoupled through $M_o$ and imaged using the microscope to the right of $M_o$, where additional optics in the Fourier plane allow for dark-field and phase imaging.
A. The 8f imaging cavity

Let us start by modelling the self-imaging cavity and the effect on the light field as it bounces between the mirrors and repeatedly interacts with the sample. The input field illuminating the first mirror, Eq. (1), is assumed to be a broad Gaussian mode with central frequency ω, a waist w greater than the sample dimensions, and a possibly time-dependent input power $P_{\rm in}(t)$. We will study both a continuous wave scenario with time-independent input power and a pulsed scenario. In the latter case, we restrict our considerations to moderate pulse lengths: they can be short enough to prevent the fields of subsequent roundtrips from overlapping, while they are still sufficiently long to neglect the variation of the longitudinal wave number k, with ω = ck, for all frequency-dependent interaction processes. The longitudinal phase factor exp(ikz) is omitted as we also operate in the paraxial regime.
Each mirror shall be treated as a simple beam splitter with amplitude reflectivity and transmissivity parameters $r_{1,2} = -\sqrt{R_{1,2}}$ and $t_{1,2}$, where $T_{1,2} = |t_{1,2}|^2$ and $R_{1,2} \gg T_{1,2}$. The propagation of the field through the lenses between mirror and sample planes is described by an ideal (infinite-aperture) 4f transformation [18] that produces the inverted image, Eq. (2).

A typical sample inserted into the cavity would be given by a thin layer on top of a transparent carrier plate of known refractive index $n_g$ and thickness $d_g \ll f$. Absorption and diffraction within the carrier are neglected. The sample layer that is to be detected shall be described by a two-dimensional refractive index profile n(x, y) of thickness d. The real and the imaginary part of n(x, y) will be imprinted in the detection signal of the microscope, i.e. the phase and the amplitude of a probe light field. Unlike in single-pass imaging, which 'sees' only the transmission profile of the sample (and holder), the present two-mirror multi-pass scheme requires us to take also the reflectivity of the sample into account. In the macroscopic limit of perfectly resolved sample structure, the transmission and reflection can be obtained by solving the boundary conditions at the interfaces of sample layer and holder material per 'pixel' (x, y) on the sample plane. For readability, we will omit the argument in the following and abbreviate $n_s = n(x, y)$, keeping in mind that all expressions are defined per pixel.
The carrier glass slab is characterized by the transmission and reflection coefficients $t_g$ and $r_g$ [19], Eq. (3), with

$$t_g = \frac{4 n_g\, e^{i(n_g-1)kd_g}}{(n_g+1)^2 - (n_g-1)^2\, e^{2 i n_g k d_g}}.$$

For a clean signature of the substrate, the carrier can be made perfectly transmissive (non-reflective) by choosing its optical thickness to be an integer multiple of half a wavelength, $n_g k d_g = j\pi$, so that $t_g = (-1)^j e^{-ikd_g}$ and $r_g = 0$. In this case, we obtain relatively simple expressions for the transmission and reflection coefficients $t_s$, $r_{s,L}$ and $r_{s,R}$ of the whole sample. Notice the difference in the reflection of fields impinging on the substrate side (L) and on the back side (R). All reflected and transmitted field components are defined relative to the incident field on the sample plane, z = 4f, which is set to be the interface between the substrate layer (to the left) and the carrier plate (right).
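The non-reflective condition can be checked numerically. The following is a minimal sketch that evaluates the slab transmission coefficient quoted above at optical thicknesses $n_g k d_g = j\pi$; the wavelength and the function name `slab_transmission` are illustrative assumptions and are not taken from the text.

```python
import cmath
import math


def slab_transmission(n_g: float, k: float, d_g: float) -> complex:
    """Transmission coefficient t_g of a lossless slab, in the form quoted in the text."""
    numerator = 4.0 * n_g * cmath.exp(1j * (n_g - 1.0) * k * d_g)
    denominator = (n_g + 1.0) ** 2 - (n_g - 1.0) ** 2 * cmath.exp(2j * n_g * k * d_g)
    return numerator / denominator


wavelength = 600e-9            # hypothetical probe wavelength in metres (assumption)
k = 2.0 * math.pi / wavelength
n_g = 1.5                      # carrier index used later in the text

for j in (1, 2, 3):
    d_g = j * math.pi / (n_g * k)                 # optical thickness n_g*k*d_g = j*pi
    t_g = slab_transmission(n_g, k, d_g)
    expected = (-1) ** j * cmath.exp(-1j * k * d_g)
    # |t_g| should equal one and t_g should match (-1)^j exp(-i k d_g)
    print(j, abs(t_g), abs(t_g - expected))
```

Away from these thicknesses the modulus of `slab_transmission` drops below one, which corresponds to the reflective-carrier regime that the text sets aside.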
As the main application of the scheme is to enhance weak optical signatures, we focus here on optically thin substrate layers, $|n_s| k d \ll 1$. To lowest order, their optical response can be characterized by the susceptibility function χ, Eq. (5), where the real part $\chi_R$ represents the sample-induced phase shift and the imaginary part $\chi_I$ the extinction of the incident field amplitude. The above sample coefficients are then approximated by the linearized expressions (6). It turns out that the validity of these linearized expressions is limited in practice when it comes to the quantitative analysis of multi-pass imaging of thin films (see Sect. III). Nevertheless they can serve as a qualitative estimate for the signal enhancement in multi-pass imaging, and we shall occasionally refer to this as the weak-sample (WS) scenario later.

In order to describe the transformation of an arbitrary input pulse at the 8f-cavity-sample system into a (possibly overlapping) sequence of output pulses, we can make use of the matrix optics formalism [20]. Given the light fields $E_{L\rightarrow}$, $E_{R\leftarrow}$ impinging on the sample plane from the left and right (with the arrows marking the propagation direction), the sample interaction is described by a linear map for each sample pixel, Eq. (7). Here, we assume that the sample interaction and the cavity properties are approximately constant over the pulse spectrum (i.e. determined by their values at the central frequency of the pulse). The passage back and forth through the 4f lens systems and reflection at the two outer mirrors can be expressed by $E_{L\rightarrow}(t + 8f/c) = r_1 E_{L\leftarrow}(t)$ and $E_{R\leftarrow}(t + 8f/c) = r_2 E_{R\rightarrow}(t)$. We arrive at a transformation matrix for each sample pass followed by a half roundtrip. At this point, we shall introduce the eigenvalues of this matrix for later use,

$$\lambda_\pm = \frac{r_1 r_{s,L} + r_2 r_{s,R} \pm \sqrt{(r_1 r_{s,L} - r_2 r_{s,R})^2 + 4 r_1 r_2 t_s^2}}{2}. \qquad (9)$$

Given an incident pulse $E_{\rm in}(t)$ that is initially coupled in through the left mirror (z = 0), the forward-running pulse amplitude on the sample plane (z = 4f) after m ≥ 1 passes through the sample is denoted $E^{(m)}$. Note that the coordinate inversion by the first 4f transformation according to (2) leaves the Gaussian input field (1) invariant. The outcoupled train of pulses at the second mirror (z = 8f), Eq. (11), is simply obtained by taking the sum over m and multiplying with the transmission of the second mirror. Again, the output field is inverted with respect to the sample plane, i.e. the sample pixel (x, y) is imaged onto (−x, −y).
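The closed-form eigenvalues (9) can be cross-checked against the trace and determinant of a 2×2 matrix. The sketch below assumes one possible layout of the half-roundtrip matrix, with diagonal entries $r_1 r_{s,L}$ and $r_2 r_{s,R}$ and off-diagonal entries $r_1 t_s$ and $r_2 t_s$; this layout and all numerical values are assumptions made for illustration. Only the trace and determinant enter (9), so any layout with the same diagonal and the same product of off-diagonal entries gives the same eigenvalues.

```python
import cmath
import math


def half_roundtrip_eigenvalues(r1, r2, t_s, r_sL, r_sR):
    """Closed-form eigenvalues lambda_+ and lambda_- as written in Eq. (9)."""
    a = r1 * r_sL
    d = r2 * r_sR
    root = cmath.sqrt((a - d) ** 2 + 4.0 * r1 * r2 * t_s ** 2)
    return (a + d + root) / 2.0, (a + d - root) / 2.0


# Hypothetical parameter values, chosen only for illustration.
r1 = -math.sqrt(0.98)
r2 = -math.sqrt(0.97)
t_s = 0.99 * cmath.exp(0.02j)
r_sL = 0.010 + 0.005j
r_sR = -0.008 + 0.002j

# Assumed matrix layout (not confirmed by the text).
M = [[r1 * r_sL, r1 * t_s],
     [r2 * t_s, r2 * r_sR]]
trace = M[0][0] + M[1][1]
det = M[0][0] * M[1][1] - M[0][1] * M[1][0]

lam_p, lam_m = half_roundtrip_eigenvalues(r1, r2, t_s, r_sL, r_sR)
# Eigenvalues of a 2x2 matrix must sum to the trace and multiply to the determinant.
print("trace residual:", abs(lam_p + lam_m - trace))
print("det residual:  ", abs(lam_p * lam_m - det))
```

Setting both sample reflectivities to zero reduces the eigenvalues to $\pm\sqrt{r_1 r_2}\,t_s$, which is the structure quoted later for the non-reflective holder.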
The input-output transformation can also be given in Fourier space, which for a fixed light frequency ω amounts to a stationary illumination, i.e. infinite pulse length. Given the temporal Fourier transform $E_{\rm in}(\omega)$ of the input field (1), the transmitted output field is given by Eq. (12). It follows either by carrying out the sum in (11) in Fourier space, or by directly solving the combined boundary value problem at the sample plane and the mirrors. Additional losses in the cavity, e.g. at the lenses, can be included by setting $R_{1,2} + T_{1,2} < 1$.
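The equivalence between summing the pulse train and the stationary solution can be illustrated in a simplified setting. The sketch below assumes a non-reflecting sample, so that only pulses with an odd number of sample passes contribute and the sum becomes a geometric series; the roundtrip phase `phi`, the value of `t1`, the sample transmission and all function names are hypothetical. It is not the full expression (12), which also contains the sample reflectivities.

```python
import cmath
import math


def roundtrip_factor(r1, r2, t_s, phi):
    """Amplitude factor picked up per full roundtrip (two sample passes, two mirrors)."""
    return r1 * r2 * t_s ** 2 * cmath.exp(1j * phi)


def pulse_sum(t1, t2, r1, r2, t_s, phi, n_terms):
    """Partial sum over output components that passed the sample 2*l + 1 times."""
    g = roundtrip_factor(r1, r2, t_s, phi)
    return sum(t1 * t2 * t_s * g ** l for l in range(n_terms))


def stationary_field(t1, t2, r1, r2, t_s, phi):
    """Closed form of the geometric series (valid for |g| < 1)."""
    g = roundtrip_factor(r1, r2, t_s, phi)
    return t1 * t2 * t_s / (1.0 - g)


T2 = 0.05                                  # output mirror transmission (example value)
r1 = -math.sqrt(0.98)                      # R_1 = 0.98 as in the text
r2 = -math.sqrt(0.98 * (1.0 - T2))         # R_2 = 0.98 (1 - T_2) as in the text
t1 = math.sqrt(0.01)                       # hypothetical input transmission amplitude
t2 = math.sqrt(T2)
t_s = 0.99 * cmath.exp(0.01j)              # hypothetical weak sample transmission
phi = 0.3                                  # hypothetical roundtrip phase (detuning)

partial = pulse_sum(t1, t2, r1, r2, t_s, phi, 2000)
closed = stationary_field(t1, t2, r1, r2, t_s, phi)
print(abs(partial - closed))               # residual of the truncated series
```

The modulus of `closed` peaks when `phi` is a multiple of 2π, which is the resonance behaviour exploited in the continuous-wave scheme.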
Our main focus here is on samples with weak optical response, which implies a low overall reflectivity, $|r_{s,(L,R)}|^2 \ll |t_s|^2 \lesssim 1$. However, the degree to which the sample reflectivity influences the multi-pass image depends also on the reflectivity of the sample holder and on the number of roundtrips. If the 8f cavity is of high finesse, i.e. supports many roundtrips, multiple sample- or holder-reflected fields interfere and may have a significant impact on the cavity resonance and on the output field.
In the following, we distinguish two complementary regimes for cavity-enhanced microscopy by comparing the characteristic duration τ of the input pulse $P_{\rm in}(t)$ to the half round-trip time 8f/c. A quasi-stationary frequency-domain description, Eq. (12), applies in the continuous-wave (CW) limit $\tau \gg 8f/c$, whereas a time-domain treatment, Eq. (11), of individual non-overlapping pulses is more suitable for $\tau < 8f/c$, i.e. in the multi-pass (MP) and ring-down (RD) cases. The intensity of the outcoupled light is then either given by the interference of many field components or a sum of individual pulses.
The sample response in the output field can be made explicit in the WS limit (6). Using the approximation $1 + x \approx e^x$ for $|x| \ll 1$, we obtain to lowest order the stationary expression (13) and the time-domain expression (14). The terms χ and $kd_g$ are to be evaluated by their values at the mean pulse wave number k. The time-domain expression (14) splits into contributions associated with odd and with even numbers of sample interactions. The latter terms describe the light that is reflected at the sample, whereas the former correspond to one pass and ℓ additional full roundtrips in the resonator, i.e. to $m = 2\ell + 1$ sample interactions, a total phase shift of $(2\ell + 1)\chi_R$, and an extinction of $(2\ell + 1)\chi_I$ per pixel. The signal enhancement by the number of passes is the key feature of the studied MP imaging scheme, as we will discuss below.
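The linear accumulation of phase and extinction with the number of passes can be made concrete with a toy pulse train. The sketch below is not the lost expression (14): it assumes that each sample pass multiplies the field by $e^{i\chi}$ with $\chi = \chi_R + i\chi_I$, neglects sample reflections and the holder phase, and uses hypothetical values for χ and the mirror parameters.

```python
import cmath
import math


def pulse_amplitude(l, chi, r1, r2, t1, t2):
    """Relative amplitude of the pulse outcoupled after l full roundtrips (m = 2l + 1 passes)."""
    m = 2 * l + 1
    return t1 * t2 * (r1 * r2) ** l * cmath.exp(1j * m * chi)


chi = 0.01 + 0.002j                 # hypothetical susceptibility chi_R + i*chi_I
r1 = -math.sqrt(0.98)
r2 = -math.sqrt(0.97)
t1 = math.sqrt(0.01)                # hypothetical mirror transmission amplitudes
t2 = math.sqrt(0.01)

for l in range(4):
    amp = pulse_amplitude(l, chi, r1, r2, t1, t2)
    empty = pulse_amplitude(l, 0.0, r1, r2, t1, t2)      # same pulse without sample
    # phase grows as (2l+1)*chi_R, amplitude decays as exp(-(2l+1)*chi_I)
    print(2 * l + 1, cmath.phase(amp / empty), abs(amp / empty))
```

In this toy model the sample-induced phase of the pulse with $m = 2\ell + 1$ passes is m times the single-pass phase, while the mirror factor $(r_1 r_2)^\ell$ sets how much light is left to carry that signal.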
In the limit of stationary illumination, the mirror system acts like a resonator, and the signal enhancement is expected to scale with the cavity finesse, i.e. the average number of photon roundtrips, see Sect. III A.

B. Sample damage
Apart from optical resolution, a key limitation for microscopy with sensitive biological samples is the damage induced by photon absorption. It sets the gauge for comparing the performance of multi-pass imaging and conventional single-pass microscopy: We can rank the performance of different microscopy schemes by the SNR of the (phase or absorption) images they produce at a fixed threshold value for the overall sample damage. In the following we will assume that the damage is proportional to the amount of energy that is absorbed by the sample.
The net absorbed power per sample pixel is formally obtained by summing the inward-oriented Poynting vectors left and right of the sample plane, assuming that the sample holder is transparent. Here, this amounts to comparing the forward- and backward-running intensities, Eq. (15). In the case of stationary illumination, this is directly proportional to the sample damage rate. The fields left and right of the sample follow by solving the boundary conditions and can be expressed in terms of the output field (12) at pixel (−x, −y). We arrive at a damage rate proportional to the cavity-enhanced output intensity.

For time-dependent input fields, the overall absorbed energy $Q_{\rm abs}$ per pixel (corresponding to $N_{\rm abs} = Q_{\rm abs}/\hbar\omega$ absorbed photons) is obtained by integrating the intensity (15) over the interrogation time and the pixel area. We conveniently express the fields on both sides of the sample in matrix notation, using (7), as a sum in which each summand represents the field after m sample interactions. A handy result is found in the case of non-overlapping roundtrip pulses. Given the temporal power profile $P_{\rm in}(t)$ of the input pulse with characteristic duration τ and a small pixel area in the center of the pulse profile, $A \ll w^2$, the input energy per pixel is denoted $Q_{\rm in}$. Assuming also a constant (average) sample response over the size of each pixel, the absorbed energy per pixel accumulated after m interactions is given by Eq. (19). In the WS limit (6), we find that the stationary absorption rate is proportional to the intra-cavity intensity times the absorption strength of the sample. For short pulses, a self-explanatory WS expression arises if $R_{1,2} = R$.

C. Amplitude and phase measurements

In this ideal scenario of perfect resolution (neither limited by finite apertures nor by a finite pixel size in the detector) the output field carries full information about the local amplitude and phase modulation for each point on the sample plane. The measurement sensitivity would be limited only by shot noise.
Depending on what information is to be extracted, we distinguish three detection schemes: a direct measurement of the local output intensity to image the light extinction profile of the sample (BF), a background-free dark-field measurement of the diffraction profile (DF), and a Zernike phase measurement [17] (Znk). In the short-pulsed regime, the detection signals are sequences of pulses arranged according to the number m of interactions with the sample (i.e. half roundtrips through the 8f imaging cavity). We shall refer to their individual per-pixel energies as $Q^{(m)}_{\rm BF,DF,Znk\pm}$. In a multi-pass scheme where the mth pulse is outcoupled by a specific triggered mechanism, and not through the second cavity mirror, the factor $T_2$ in $Q^{(m)}_{\rm BF,DF,Znk\pm}$ must be replaced by the transmission efficiency of the outcoupler.
In the bright-field case (BF), the time-resolved detection signal will be determined by the absolute square of the field (11). For non-overlapping short roundtrip pulses, the square of the sum of the fields reduces to a sum of squares, and we obtain the bright-field signal (22), once again evaluated at the mirrored image pixel of the sample. This rather featureless expression exhibits a dichotomic behavior between odd and even numbers m of sample interactions. The input light and the most significant sample response appear in the transmission signal after full cavity roundtrips, i.e. odd $m = 2\ell + 1$. The signal after an even number of interactions implies at least one reflection at the sample (or sample holder) and is thus of higher order in its optical response.
In BF microscopy of weak samples, most of the output light is just the transmitted input beam distributed over many roundtrips, with a small sample-induced modulation. We thus define the actual sample signal in each pulse relative to a reference $Q^{(m)}_{\rm ref}$, which could be another spot on the detection plane with a different sample profile (in a differential measurement), or an empty reference pixel. In the latter case, we obtain the reference signal from the output field of the 8f imaging cavity and an empty sample plate, assuming homogeneous illumination. It has the same form as (11), but with the reflection and transmission coefficients (3) of the empty glass plate in place of the sample terms. The eigenvalues of the corresponding round-trip matrix simplify to $\pm\sqrt{r_1 r_2}\, e^{-ikd_g}$ in the non-reflective case. For non-reflective holders, the reference signal (24) is nonzero only after odd multiples $m = 2\ell + 1$.

Once again, we get a clearer picture in the WS limit (6). Expanding the eigenvalues (9) to lowest non-vanishing order in χ and integrating over the pulse duration, we obtain the BF signal (25) for odd sample interactions, i.e. full roundtrips. The result is negative due to the accumulated extinction of the input pulse at the sample, while the phase response does not enter this first-order expression. In fact, the validity of the approximation is restricted to not too many roundtrips and to samples with significant absorption, $m|\chi| \ll 2\pi$ and $\chi_R^2 \ll \chi_I$. The signal in between full roundtrips at even $m = 2\ell$ is comprised of light reflected at the sample and is therefore of second order, Eq. (26).

In dark-field and Zernike phase imaging, the outcoupled field (11) passes another 4f configuration before detection, hitting either a small absorber (DF) or a phase plate (Znk) in the Fourier plane at the distance 2f behind the exit mirror.
The 2f transformation of a paraxial field amplitude yields the spatial Fourier transform [18,21]. Being subject to a thin absorbing or phase-shifting plate, the field is then multiplied by a transmission function [1 − b(x, y)], modulating its amplitude or phase where $b(x, y) \neq 0$. This is followed by another 2f transform leading to the detection field. The dark-field image of a homogeneously illuminated sample structure is obtained by blocking the undiffracted forward component from the outcoupled field. This can be realized here by placing an absorbing element in the origin of the Fourier plane (see Fig. 1), e.g. a circular obstacle with radius $\rho > f/kw$, so that $\tilde b(q) = 2\pi\rho\, J_1(q\rho)/q$.
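The Bessel-function form of the obstacle's Fourier transform follows from a short standard calculation. The derivation below writes ρ for the obstacle radius (a symbol assumed here) and takes b = 1 inside the disc and b = 0 outside; the sign and normalization of the Fourier convention are likewise assumptions.

```latex
\tilde b(q)
  = \int_0^{\rho} r\,\mathrm{d}r \int_0^{2\pi} \mathrm{d}\varphi\; e^{-i q r \cos\varphi}
  = 2\pi \int_0^{\rho} J_0(q r)\, r\,\mathrm{d}r
  = \frac{2\pi}{q^2} \int_0^{q\rho} u\, J_0(u)\,\mathrm{d}u
  = \frac{2\pi\rho\, J_1(q\rho)}{q},
\qquad
\lim_{q \to 0} \tilde b(q) = \pi\rho^2 .
```

The second step uses the integral representation of $J_0$ and the last step uses $\int_0^a u J_0(u)\,\mathrm{d}u = a J_1(a)$; the small-q limit recovers the area of the disc, as expected for the undiffracted component.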
If the relevant sample size is much smaller than the Gaussian waist w of the incident probe field (1), we can choose an absorber size that blocks only the undiffracted beam and lets almost all the diffracted light pass. The blocked field (29) is then approximately given by the output field of the 8f imaging cavity with an empty sample plate. Note that the sample pixel is now imaged onto the same pixel on the detection plane.
In the short-pulse limit, the dark-field detection signal is given by $Q^{(m)}_{\rm DF}$, Eq. (30). The output pulses associated with even and odd sample interactions are now of the same magnitude, and there is no need to subtract another reference term. In the WS limit (6), the even orders are identical to (26) before, and the odd ones are also of second order in the weak sample response.

For the Zernike phase contrast method, the opaque plate in the Fourier plane behind the exit mirror is replaced by a phase plate that shifts the undiffracted background field component by ±π/2 [17]. The field arriving at the detector in the Zernike scheme can be understood as a superposition of the background-free dark-field signal and the π/2-shifted undiffracted field without sample. Depending on the sign of the phase shift the technique is referred to as negative (Znk−) or positive (Znk+) phase contrast microscopy. We obtain the signal from the DF case by inserting a complex prefactor, $b(x, y) \to (1 \mp i)\, b(x, y)$. Repeating the above approximation steps then yields $Q^{(m)}_{\rm Znk\pm}$, Eq. (32). Once again, we subtract the bright offset from the actual sample response, because, contrary to the DF case, the phase plate does not remove the reference signal (24). The Zernike configuration can provide strong signals even for weak phase shifts of optically thin, transparent samples, as the phase response now appears in first order after full roundtrips. The WS limit yields an expression of the same form as the BF signal (25), but with $\mp\chi_R$ instead of $\chi_I$.

D. Enhanced phase estimation by multi-passing
We have shown that the phase or extinction signature of weak optical samples is generally enhanced linearly (in BF and Znk schemes) or quadratically (DF) by the number m of times a probe field interacts with the specimen in the imaging cavity. This gain in measurement sensitivity with respect to the shot noise-limited accuracy of a single-pass microscope becomes apparent if we view the WS imaging as a parameter estimation problem.
In the absence of extinction losses and sample holder, a WS imprints the phase $\chi = \chi_R$ onto the coherent probe light upon each interaction. This phase can be estimated in the Znk+ scheme, where the purpose of the phase plate in the outcoupling stage is to interfere the phase-shifted component of the probe field with the unshifted one. The MP scheme implements the sequential application of the phase shift to one of the two interfered components. We then estimate the phase after m passes by means of the difference between the detected photon numbers of the sample pixel and an empty reference pixel. Using simple error propagation of the respective shot noise, we get a mean estimate and error [11,22] in terms of N and $N_{\rm ref}$, the mean photon numbers of sample and reference pixel, respectively. Given that the latter are of about the same magnitude, we find $\delta\chi \sim 1/(m\sqrt{2N_{\rm ref}})$. If sample damage is an issue, one can adjust the input intensity for a fixed number of photon-sample interactions, i.e. constant $N_{\rm ref}\, m$. In this case, which will be studied in detail below, the error scales like $1/\sqrt{m}$. For an equivalent CW or RD detection scheme, the same proportionality holds with the mean number of sample interactions in place of m.
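The scaling at fixed damage can be checked with a few lines of arithmetic. The sketch below evaluates $\delta\chi \sim 1/(m\sqrt{2N_{\rm ref}})$ while holding the product $N_{\rm ref}\, m$ constant; the budget of 1000 photon-sample interactions and the function name `phase_error` are assumptions for illustration.

```python
import math


def phase_error(n_ref: float, m: int) -> float:
    """Shot-noise phase error estimate 1 / (m * sqrt(2 * N_ref))."""
    return 1.0 / (m * math.sqrt(2.0 * n_ref))


interaction_budget = 1000.0      # hypothetical fixed value of N_ref * m

for m in (1, 4, 16, 64):
    n_ref = interaction_budget / m          # fewer photons, each interacting m times
    # multiplying the error by sqrt(m) gives the same number for every m
    print(m, phase_error(n_ref, m), phase_error(n_ref, m) * math.sqrt(m))
```

The last printed column is constant, which is the $1/\sqrt{m}$ improvement at a fixed number of photon-sample interactions.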

III. SIGNAL TO NOISE AT CONSTANT DAMAGE
We will now compare the various imaging modalities in terms of signal to noise at constant damage. As an illustrative example we will discuss the use of cavity enhanced microscopy for the characterization of ultrathin films of carbon and boron nitride (BN). Density functional theory calculations yield an index of refraction of graphene of 2.71 + 1.41i [23]. At this wavelength BN has a refractive index of 1.8 [24], where the imaginary part is negligible due to the large bandgap of 5 eV [25]. The susceptibility of the two materials can be obtained from Eq. (5). The thickness of a monolayer is 3.35 Å for graphene and 3.33 Å for BN [24].
In traditional microscopy these samples provide very low contrast and their detectability depends strongly on the thickness of the substrate and the probing wavelength [26,27]. Already now, several optical techniques are being used to characterize thin film growth [28], and interferometric multibeam interference schemes, such as Tolansky interferometry [29], are used to enhance the measurement sensitivity [30]. The self-imaging capabilities of the cavity enhanced microscopy techniques discussed here offer the advantage that spatial film thickness variations can be detected locally (and in principle at diffraction-limited resolution).
So let us assume a sample with a substrate layer spatially varying in thickness or material. In order to detect such variations, we shall select two areas of the substrate that differ in their optical responses and image them onto two different pixels, $(x_1, y_1)$ and $(x_2, y_2)$. Depending on the imaging scheme, $N_{1,2}$ photons will be detected on the two pixels, respectively (where the photon number is given by the pulse energy divided by $\hbar\omega$). In a differential measurement, and assuming shot noise in the photodetector, the SNR of detecting the variation will then be $\mathrm{SNR} = |N_1 - N_2| / \sqrt{N_1 + N_2}$. We will evaluate and compare it at a fixed damage level for continuous-wave, ring-down and multi-pass microscopy, using the various imaging modes discussed in the previous section. For simplicity, we focus on the case studied in the previous section where the second pixel corresponds to an empty reference area, $\chi(x_2, y_2) = 0$. At uniform illumination, this is equivalent to imaging a single pixel with and without the substrate layer. We assume uniform illumination of the relevant pixels with the same incident light energy.

Due to the low sample absorption and high number of roundtrips considered, the WS approximation will not always be valid; we use it to discuss the qualitative scaling of the signal enhancement. The numerical simulations are based on the full expressions derived in the previous section. For all scenarios, we will use an input mirror of reflectivity $R_1 = 0.98$, a realistic value given current coating technology, accounting for both the finite transmission and the light losses in the 4f imaging optics left of the sample plane. For the output mirror, we shall assume $R_2 = 0.98(1 - T_2)$, either with variable transmission $T_2 > 0$ to control the output light in the CW and RD scenario, or with negligible $T_2 \lesssim 0.01$ to minimize roundtrip losses in the MP case.
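The figure of merit and the mirror parametrization used in this section are simple enough to state as code. The sketch below implements the differential shot-noise SNR and the output-mirror reflectivity $R_2 = 0.98(1 - T_2)$; the function names and the example photon numbers are assumptions, and the example neglects sample reflection.

```python
import math


def differential_snr(n1: float, n2: float) -> float:
    """Shot-noise limited SNR |N1 - N2| / sqrt(N1 + N2) of a two-pixel measurement."""
    return abs(n1 - n2) / math.sqrt(n1 + n2)


def output_mirror_reflectivity(t2: float) -> float:
    """Output mirror reflectivity R_2 = 0.98 * (1 - T_2) as assumed in the text."""
    return 0.98 * (1.0 - t2)


# Illustration only: a reference pixel receiving 1000 photons and a sample pixel
# that loses 26 of them to absorption in a single pass.
n_reference = 1000.0
n_sample = 1000.0 - 26.0
print(differential_snr(n_sample, n_reference))
print(output_mirror_reflectivity(0.01), output_mirror_reflectivity(0.25))
```

Because the noise term grows only with the square root of the detected photon number, any scheme that increases the per-photon signal at fixed absorbed energy raises this SNR.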

A. Continuous wave cavity microscopy
Formally, the continuous wave (CW) scenario corresponds to the stationary limit of a constant fixed-frequency input power $P_{\rm in}$. In practice, it is achieved in the limit of very long input pulses, such that the constructive interference over all roundtrips can lead to an enhanced intra-cavity field. The CW description applies if, in addition, fringe effects due to the initial buildup and the final decay of the cavity enhancement are small, i.e. the mean pulse duration τ far exceeds the inverse linewidth of the 8f cavity system. The time-integrated detection signal transmitted by the cavity and the energy absorbed by the sample can then be expressed in terms of the input power times a detection window. A further division by $\hbar\omega$ results in the respective photon numbers and in the dimensionless SNR evaluated below.
We expect a high sensitivity to the phase shift and absorption of a weak optical sample if the cavity is of high finesse, i.e. supports many roundtrips. The empty imaging cavity has its resonances where the round-trip length 16f is a multiple of the wavelength λ, i.e. 8kf = Kπ with integer K, and it supports a mean number of roundtrips determined by the mirror reflectivities $R_{1,2}$. A non-reflective sample holder shifts the resonances to $k(8f - d_g) = K\pi$. When the input field is tuned close to or on resonance, any sample-induced phase shift or extinction will result in a sharp change of the transmitted output signal, according to the Lorentz function. This is nicely illustrated in the WS approximation, where we can expand the stationary output field (13) around resonance. Omitting the glass plate and considering the limit of highly reflective mirrors at odd order K, we obtain a lowest-order expression that illustrates how the sample response is enhanced by the cavity finesse, i.e. the number of supported roundtrips if the input mirror is set to be almost perfect and other intra-cavity losses are neglected.

Fig. 2 shows the SNR obtained for the detection of a graphene monolayer as a function of the effective cavity length (or detuning) and of $T_2$. Panels (a)-(d) show the results obtained in a bright-field (BF), dark-field (DF), negative (Znk−) and positive (Znk+) phase contrast detection scheme, respectively. For all these plots the damage was kept constant at about 26 absorbed photons, which is the damage that a short pulse of energy $T_1 Q_{\rm in} = 1000\,\hbar\omega$ does in a single pass through the sample. The best SNR is found on cavity resonance, which gets more pronounced for lower $T_2$, corresponding to a cavity of higher quality.
In Fig. 3 (a) we provide a horizontal cut through the previous diagrams at a fixed detuning and plot the SNR as a function of $T_2$. For each of the four detection schemes, we chose the cavity length that supports the maximum SNR in Fig. 2. The graphene monolayer yields the highest SNR in a BF detection scheme (green), followed by DF (black), Znk− (blue) and Znk+ (red). The dotted lines represent the WS approximation based on the output field expression (13), which matches remarkably well even at low $T_2$ when the cavity supports many roundtrips.
For comparison, we list the SNR values for single-pass detection at the same damage level in Tab. I. A cavity with the mentioned specifications enhances the detection SNR by up to a factor of ten as compared to the optimal single-pass microscopy technique. Even for $T_2 = 1$ the cavity simulations differ from these results due to light reflected from the specimen.
The results for a monolayer of BN are shown in Fig. 3 (b), keeping the damage level again fixed at the number of absorbed photons in a single interaction with a pulse energy $T_1 Q_{\rm in} = 1000\,\hbar\omega$. We use the same reference value for all detection schemes and in all the following. Since BN has a negligible imaginary component of the refractive index, the number of absorbed photons will now be less than one. Hence for every $T_2$ in Fig. 3, the SNR is evaluated at about the same mean number of light-sample interactions, albeit at a varying damage level in each sample. Given the real index of refraction, the best SNR in the BN case (b) is obtained in phase contrast readout schemes. The BF detection scheme gives the worst SNR. For a range of values of $T_2$ around 0.04, BF also gives a considerable SNR, mainly because the interaction with the stationary cavity field translates phase to amplitude contrast. For 20 monolayers of BN in Fig. 3 (c) the detection SNR is generally higher. However, the accumulated phase shifts can now be significant, which is why BF and DF detection can out-compete phase contrast detection schemes, and also why the SNR in positive phase contrast readout goes to zero for $T_2 \sim 0.23$. Yet the WS approximation (dotted) still captures this behavior well.

The above calculations were done for free sample layers without glass carrier plates, $d_g = 0$. For non-reflective glass slabs, i.e. multiples of half-wavelengths in optical thickness, the cavity resonances are shifted, but the achievable SNR values are similar. For direct comparison with Fig. 3, we show the results for a specimen carrier with $n_g = 1.5$ and $n_g k d_g = \pi$ in Fig. 4. We remark that for $d_g = 0$ the best SNR values are obtained close to an odd cavity resonance, [8f mod λ] ≈ 0.5, whereas now they are slightly lower and situated at [(8f − $d_g$) mod λ] ≈ 0.
The glass plate induces an effective phase shift 2k d_g between the left- and right-running components that not only shifts the cavity resonance, but also modulates the reflections at the sample layer, as seen explicitly in (14). In the pulsed imaging schemes discussed below, this will mainly affect the sample-reflected pulses outcoupled after an even number of sample interactions, see (26).
A qualitatively different sample image would be observed if light were reflected by the carrier plate itself, i.e. for n_g k d_g ≠ jπ. The weak response of the specimen would then be interlaced with the signature of the semitransparent carrier, which typically results in a lower SNR for weak samples. We do not discuss this regime here.

B. Cavity ring down microscopy
Fig. 4. As in Fig. 3, but with a sample holder of thickness n_g k d_g = π at n_g = 1.5. Once again, we chose the optimal cavity length for BF (green), DF (black), Znk+ (red), and Znk- (blue). It is now at (8f − d_g) mod λ ≈ 0. We compare the results for a graphene monolayer (a), a BN monolayer (b), and 20 BN monolayers (c). In (b), both Znk schemes give the same curve.

After the stationary scenario, where the light is allowed to interfere constructively in the imaging cavity, we now discuss the opposite regime of short, non-overlapping probe pulses. The straightforward way to enhance the sample signal by multiple sample interactions is a ring-down (RD) scheme [31,32]. A single input pulse of energy Q_in and temporal width τ < 8f/c is sent through the imaging cavity, accumulating losses and phase shifts as it bounces between the mirrors. The resulting output field is an attenuating train of pulses spaced by δt = 8f/c, which can be either deposited as a cumulative signal in an integrating detector or recorded individually in a time-resolved manner.

We will first discuss the performance of the time-integrated RD scheme. The experimentally more demanding time-resolved detection of individual pulses provides more options for signal analysis, and it can be seen as a serial implementation of the multi-pass (MP) scheme discussed in Sect. III C. Each subsequent pulse corresponds to an increasing number of sample passes, but only a small fraction of it is then transmitted through the cavity end mirror, T_2 ≪ 1. In order to assess the performance of RD imaging with respect to a conventional single-pass image, we vary the effective number of passes by tuning the transmission T_2 of the exit mirror and again adjust the input pulse energy accordingly to keep the total sample damage accumulated over all passes fixed, m → ∞ in (19). The BF, DF, and Zernike detection signals are given by the sums of the individual pulses from m = 0 to ∞ in (22), (30), and (32), respectively. For the BF and Zernike signal, we also subtract an empty reference pixel; its output signal then contributes to the noise.

In the WS limit (6), we can compare the performance of the RD and the CW scheme by looking at the scaling with the empty-cavity roundtrip number at high mirror reflectivities, which is ≈ 2/(1 − R_1 R_2). The accumulated sample damage (21) can be expressed as Q … Notice that all the signals are four times smaller than their CW counterparts, while Q_abs is only two times smaller. Hence, at equal damage in the WS and high-reflectivity limit, the RD signals are worse than the CW signals by a factor of two and the SNR drops by 2√2. This interferometric advantage is due to the coherent amplification of the intra-cavity field amplitudes that are transmitted and reflected by the sample in the CW case. Even in the limit of T_1 → 1, we find that the linear sample response differs by a factor of two between the CW output field (13) and the first output pulse in (14). The CW advantage comes with the experimental difficulty of having to stabilize the cavity at its resonance. Nevertheless, the SNR scaling with the roundtrip number remains the same in both schemes.
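The geometry of the ring-down pulse train can be illustrated with a minimal Python sketch. It is not the model used for the simulations in this paper: it assumes an empty cavity, lossless mirrors (R_i = 1 − T_i), incoupling through mirror 1 and detection behind mirror 2. The function names and the numerical values in the usage note are hypothetical. Under these assumptions, output pulse n has crossed the sample plane 2n + 1 times, and the energy-weighted mean number of crossings is (1 + R_1 R_2)/(1 − R_1 R_2), which approaches 2/(1 − R_1 R_2) at high reflectivity. Whether this is exactly the roundtrip number meant in the text is an assumption.

```python
# Empty-cavity ring-down pulse train (illustrative sketch, not the paper's full model).
C = 299_792_458.0  # speed of light in m/s


def ring_down_train(q_in, t1, t2, f, n_pulses):
    """Return (times, energies) of the first n_pulses output pulses.

    Assumptions: lossless mirrors (R_i = 1 - T_i), no sample, incoupling
    through mirror 1 and detection behind mirror 2. Times are measured
    relative to the first output pulse.
    """
    rho = (1.0 - t1) * (1.0 - t2)   # energy survival per roundtrip, R_1 * R_2
    dt = 8.0 * f / C                # pulse spacing: one roundtrip of length 8f
    times = [n * dt for n in range(n_pulses)]
    energies = [t1 * t2 * q_in * rho**n for n in range(n_pulses)]
    return times, energies


def mean_passes(t1, t2):
    """Energy-weighted mean of m = 2n + 1 over all output pulses n = 0, 1, 2, ..."""
    rho = (1.0 - t1) * (1.0 - t2)
    return (1.0 + rho) / (1.0 - rho)
```

For example, `ring_down_train(1000.0, 0.05, 0.05, 0.1, 50)` lists the first 50 pulses for a hypothetical focal length of 0.1 m, and `mean_passes(0.05, 0.05)` gives the corresponding mean number of sample-plane crossings. Lowering T_2 stretches the train and raises this mean, which is how the effective number of passes is tuned in the RD scheme.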
The numerical results are plotted in Fig. 5 for the same samples and damage levels as in Fig. 3. While there are many similarities to the previous CW case, there are also some striking differences: First, the achieved SNR is consistently smaller, as discussed before. Moreover, the WS approximation, based on linearized expressions for the absorbed and detected pulse energies, quickly ceases to be valid as T_2 decreases. For the BN samples in Fig. 5 (b) and (c), which are characterized by a real index of refraction, we notice that the time-integrated BF signal vanishes. In this case, the sample acts like a lossless beam splitter that redistributes the incident light energy over a trail of multiply reflected and transmitted pulses. Hence the time-integrated output energy is conserved and equal to the reference signal. This BF signal cancellation could be avoided with time-tagged detection, e.g. by using an avalanche photodiode array detector [33] and flipping the sign of subsequent pulses in the post-processing stage. The same technique would also avert the SNR cancellation in the negative phase contrast detection of 20 BN monolayers, which is reflected in the dip of the blue curve in Fig. 5 (c).
The results were again evaluated without sample holder, d_g = 0. Using a non-reflective glass plate as in Fig. 4 for the CW case, the achievable SNR values in RD imaging would also exhibit a slightly different T_2-dependence and an overall decrease. This is mainly due to the suppression of sample reflections, as seen explicitly in the WS limit. There, only the (weaker) even pulse orders (26) are affected by the sample holder, in proportion to 2√(1 − T_2) cos(2k d_g) − T_2. One thus obtains slightly better SNR values if k d_g is zero or a multiple of π.

C. Multi-pass microscopy
In multi-pass (MP) imaging the goal is to limit and control the number of sample interactions of a short pulse entering a high-finesse imaging cavity. This can be achieved, for instance, by placing a fast outcoupling mechanism behind the sample plane that is locked to the input pulse timing and triggers after a delay corresponding to a selected number ℓ of full roundtrips. This allows one to choose between odd numbers of sample interactions, m = 2ℓ + 1. Ideally, the outcoupling occurs at unit efficiency, while the imaging cavity should be of high finesse to minimize any sample-independent roundtrip losses. In practice, one can implement the outcoupling by means of a Pockels cell and a polarizing beam splitter. A Pockels cell is routinely incorporated in optical cavities and its losses can be neglected compared to realistic losses at lens interfaces. For convenience, we shall assume R_1 = R_2 = R, using R = 0.98 for the numerical examples. The empty reference pixel yields the signal Q_ref = T R^(2ℓ) Q_in. (Note that one could also outcouple after even numbers of sample interactions, which captures the fraction of light reflected at the sample. The outcoupled light would not contain the bright background contribution Q_ref, but rather resemble the DF image after full roundtrips.)

Once again, we can estimate the explicit scaling of the SNR with the selected number of roundtrips in the WS limit. Here, the damage reduces to Q … The BF and Z signals follow from (25) and (33) after removing T_2 and subtracting the empty pixel. Dividing by the respective shot noise amplitudes leaves us with an SNR that grows with the number of passes like √[(1 − R)/(1 − R^(2ℓ+1))] R^ℓ (2ℓ + 1). For a moderate number of roundtrips, we can expand to lowest order in ε = 1 − R and find a square-root enhancement, SNR_bf,df,Z ∝ √(2ℓ + 1). After many more roundtrips, the SNR decreases again exponentially with R^ℓ ∼ e^(−ℓε). So we expect a sweet spot of maximum SNR at roughly ℓ ∼ 1/ε, provided the extinction at the weak sample is lower than the cavity loss and the accumulated phase shift (2ℓ + 1)χ_R ≪ π.

The achievable SNR as a function of m is plotted in Fig. 6 (small dots). The damage levels were again chosen as in the CW and RD case. At that illumination intensity the single-pass SNR for detecting a monolayer of graphene or BN is below unity, irrespective of whether the sample is investigated in a BF (green), DF (black), Znk+ (red) or Znk- (blue) microscope, see the leftmost data points in Fig. 6 (a) and (b). Multiple passes initially increase the SNR, until losses eventually outweigh the gain in sensitivity offered by each additional pass. Due to the negligible loss in BN, its optimum sensitivity is reached after a higher number of roundtrips than for graphene. Given that all the light can be outcoupled after the optimal number of interactions, it might be surprising that the achievable SNR values are very close to the ones observed in Fig. 5 for the RD case. This is because the light that underwent an odd number of reflections from (i.e. an even number m of interactions with) the sample forms a counter-propagating pulse that is not outcoupled to the detector. It thus neither cancels the BF signals, as observed in an RD scheme for the BN samples, nor does it contribute to the DF or phase-contrast signals. The performance of an MP scheme can potentially be improved by adjusting the timing window of the detection to also include the counter-propagating pulse.

For the 20 BN monolayers in Fig. 6 (c), the SNR of this thicker sample is higher and shows oscillations as the phase shifts accumulate. The WS approximation (bright circles) is valid for several tens of passes in (a) and (b); it does not hold for the sample of 20 monolayers and is therefore omitted in (c). We note that samples of sub-diffraction limited dimensions would represent even weaker samples, for which the WS limit might be appropriate also at high interaction numbers.
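The weak-sample scaling argument for the MP scheme can be checked numerically with a short Python sketch. It rests on two assumptions: that the scaling factor reads √[(1 − R)/(1 − R^(2ℓ+1))] R^ℓ (2ℓ + 1), which is the form consistent with the square-root limit and the exponential decay stated in the text, and that ℓ (here `ell`) counts full roundtrips so that m = 2ℓ + 1. The function names are hypothetical, and the factor is normalised so that a single pass (ℓ = 0) gives 1.

```python
import math


def mp_snr_gain(ell, R):
    """Weak-sample SNR scaling factor of a multi-pass image at fixed damage.

    ell : number of full roundtrips, giving m = 2*ell + 1 sample interactions
    R   : reflectivity of both end mirrors (R_1 = R_2 = R assumed, 0 < R < 1)

    Assumed form: sqrt((1 - R) / (1 - R**m)) * R**ell * m, equal to 1 for ell = 0.
    """
    m = 2 * ell + 1
    return math.sqrt((1.0 - R) / (1.0 - R**m)) * R**ell * m


def best_roundtrips(R, ell_max=1000):
    """Brute-force search for the roundtrip number that maximises the scaling factor."""
    return max(range(ell_max + 1), key=lambda ell: mp_snr_gain(ell, R))
```

For R = 0.98, i.e. ε = 0.02, `mp_snr_gain(1, 0.98)` stays close to √3, in line with the square-root enhancement at small ℓ, while `best_roundtrips(0.98)` returns a few tens of roundtrips, of the same order as 1/ε = 50. The sketch ignores sample absorption and accumulated phase shifts, so it only applies where the WS approximation holds.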
In the case of a finite carrier plate, n_g k d_g = π as studied in Sect. III A, we expect (and numerical simulations confirm) no significant change in the SNR associated with the odd interaction orders m that are plotted for the single monolayers in Fig. 6 (a) and (b). The behavior does change for 20 BN monolayers (c), where the WS approximation is no longer valid. We plot the results with sample carrier in Fig. 7.
Note that the picture changes completely if we consider reflective sample holders, n_g k d_g ≠ jπ. Light that has undergone multiple reflections and transmissions at the sample would then be redistributed over several pulses by the carrier plate, which acts as a semi-transparent mirror. No clear distinction between even and odd m could be made in terms of the sample response any longer.

IV. CONCLUSIONS
It is a well-established result in the quantum description of phase estimation [10-13] that multiple interactions between a probe particle and a specimen can lead to an enhanced measurement precision beyond the shot-noise limit. The same enhancement can be observed in optical wide-field microscopy, using self-imaging cavities to amplify the phase and amplitude contrast of optically thin, weak samples. Here we have assessed in detail how the signal-to-noise ratio increases with the number of light-sample interactions. Our results not only apply to the recently demonstrated pulsed multi-pass schemes with a controlled number of light-sample interactions [4,15], but also to ring-down and continuous-wave schemes where the enhanced interaction strength is set by the finesse of the self-imaging cavity.
Indeed, we find that in the limit of weak samples and lossless cavities, the sensitivity of appropriate (optimal) detection schemes always grows with the square root of the mean number of light-sample interactions, in agreement with results from quantum measurement theory [11]. When the detection scheme is not ideal with respect to the properties of the specimen, the sensitivity enhancement can even surpass the square-root scaling, as we have shown for bright-field images of non-absorbing phase shifters, for instance. This might be a desirable feature if the detection technique cannot be altered due to other experimental constraints, or if the sample properties are completely unknown in the first place.
In the presented case studies, the detection of a few atomic layers of graphene or boron nitride, all cavity-enhanced microscopy techniques could clearly outperform a conventional single-pass image in terms of signal-to-noise at a fixed number of photon-specimen interactions. We obtained the best results with the continuous-wave approach, which profits from the interference between co- and counter-propagating components of the intra-cavity field that does not occur in a pulsed multi-pass or ring-down scheme. The resonance behavior also leads to a conversion from phase contrast to amplitude contrast, such that a pure phase sample can be detected using a bright-field detection scheme. However, the sensitivity enhancement requires the imaging cavity to be fine-tuned to its sample-dependent resonance and is thus more challenging to realize in an experiment.
In the pulsed multi-pass and ring-down schemes, co- and counter-propagating light pulses will reach the detector at different times. Even though no interference takes place between those pulses, the phase or amplitude signal of the sample can still be amplified, de-amplified, or cancelled, depending on the chosen imaging technique. To further improve the signal-to-noise ratio, one can employ time-gated detection (e.g. using an avalanche photodiode array detector [33]), or consider ring resonators that outcouple co- and counter-propagating pulses in different directions and onto separate detectors.
While the analysis was done for scalar fields, it was shown that polarization-sensitive measurement techniques are also enhanced by multi-passing [4], with potential applications such as ellipsometry. Further numerical studies may give more insight into the diffraction-limited detection of sub-wavelength samples such as nano-structured materials or biological specimens. So far, first results have only been obtained for continuous-wave imaging via the transverse modes of a resonant, degenerate cavity [34]. It remains to be investigated how the less stringent requirements of a pulsed detection scheme would affect the signal, and possibly the resolution and depth of field obtained when imaging sub-wavelength structures.