Performance of serial time-encoded amplified microscope

Abstract: Serial time-encoded amplified microscopy (STEAM) is an entirely new imaging modality that enables ultrafast continuous real-time imaging with high sensitivity. By means of optical image amplification, STEAM overcomes the fundamental trade-off between sensitivity and speed that affects virtually all optical imaging systems. Unlike conventional microscope systems, the performance of STEAM depends not only on the lenses, but also on the properties of other components that are unique to STEAM, namely the spatial disperser, the group velocity dispersion element, and the back-end electronic digitizer. In this paper, we present an analysis that shows how these considerations affect the spatial resolution, and how they create a trade-off between the number of pixels and the frame rate of the STEAM imager. We also quantify how STEAM's optical image amplification feature improves the imaging sensitivity. These analyses not only provide valuable insight into the operation of STEAM technology but also serve as a blueprint for implementation and optimization of this new imaging technology.


Introduction
As a bread-and-butter diagnostic tool in biomedicine, optical microscopy has fueled spectacular progress in unraveling the complexity of the physiological processes in biological tissues and cells [1-4]. In particular, high-speed optical microscopy has been of great importance for studying dynamical processes, especially non-repetitive transient phenomena.
Examples are (i) the spatiotemporal study of biochemical waves in cells/tissues, which requires imaging with a µs- to ns-response time [5,6], and (ii) flow cytometry, where high-speed imagers are required to provide high-throughput cell characterization [7].
However, it has been challenging for conventional optical microscopy to capture fast dynamical processes with high sensitivity and temporal resolution. This is mainly due to the fundamental trade-off between sensitivity and frame rate that appears in standard charge-coupled devices (CCDs) and their complementary metal-oxide-semiconductor (CMOS) counterparts, the workhorses of microscopy [1,8,9]. Other drawbacks of fast CCD/CMOS imagers are the requirement of cooling to reduce thermal noise, which adds the complexity and cost of refrigeration, and the need for high-intensity illumination to ensure an adequate signal-to-noise ratio (SNR), which can damage or modify the object being imaged.
We recently developed an entirely new imaging modality called serial time-encoded amplified microscopy (STEAM) [10,11]. It overcomes the limitations of CCD/CMOS imagers and offers frame rates (more than 5 MHz) that are a few orders of magnitude higher than those of these imagers. It encodes the image onto the spectrum of a laser pulse and then converts the spectrum into an optically amplified time-domain serial data stream. This allows us to capture images with a single-pixel detector, eliminating the need for CCD/CMOS imagers, while optical image amplification removes their associated trade-off between imaging sensitivity and speed.
Unlike conventional microscopes, in which the lens system and the CCD/CMOS imager in general dictate the imaging quality, the performance of the STEAM system depends not only on the lens system, but also on the characteristics of the mapping processes which convert the spatial information to the serial temporal data. As a result, evaluating the final STEAM image quality requires careful consideration of the properties of the individual elements involved in these processes. In this paper, we present an analysis of STEAM performance to show how these considerations uniquely establish the limit to spatial resolution and create the trade-off between the number of pixels and the frame rate. More importantly, we also describe the impact of optical image amplification on imaging sensitivity.

Working principle of STEAM
The key feature of STEAM is the mapping of an image into a serial time-domain waveform by a two-step approach (Fig. 1): (i) Space-frequency mapping: the spatial information of an object is first encoded onto the spectrum of a broadband pulse by using a spatial disperser [10,11]. (ii) Frequency-time mapping: a dispersive element (e.g., a dispersive fiber) is then used to perform a process called amplified dispersive Fourier transform (ADFT), which maps the spectrum of an optical pulse into a temporal waveform using group-velocity dispersion (GVD) [12-15]. The optical spectrum, which is encoded with the image, now appears as a serial sequence in time. To simultaneously amplify the image, the dispersive fiber is pumped with secondary light sources to implement optical amplification directly within the fiber. This powerful approach compensates for the inherent loss associated with GVD [10-15] and brings the signal above the thermal noise of the photodetector, enabling high-speed imaging at low light levels. Owing to the optical image serialization, the image can be detected with a single-pixel photodiode and captured, not by a CCD/CMOS camera, but with any real-time digitizer.
STEAM can perform either one-dimensional (1-D) or two-dimensional (2-D) imaging, called 1-D STEAM or 2-D STEAM, respectively. 1-D STEAM utilizes a 1-D spatial disperser, such as a prism or a diffraction grating, to generate a 1-D spectral pattern in space for illumination, resembling a 1-D spectral shower [10,16-18]. The 1-D spatial information is then encoded onto the back-reflected 1-D spectral shower, which is subsequently converted into a serial temporal waveform by ADFT. In contrast, 2-D STEAM uses a 2-D spatial disperser, which comprises two orthogonally oriented 1-D spatial dispersers, to generate a 2-D spectral shower onto which the 2-D spatial information is encoded. The subsequent frequency-time mapping process is identical to that of 1-D STEAM. The analysis of the STEAM performance, in the context of both 1-D and 2-D STEAM, is the main focus of this paper.

Spatial resolution of STEAM
The actual spatial resolution of STEAM is not solely determined by the diffraction limit, as it is in typical confocal microscopy [1]. It can also be affected by the space-frequency and frequency-time mapping processes. Specifically, it can be governed by (i) the spectral resolution of the spatial disperser (spatial-dispersion-limited regime), (ii) the spectral resolution imposed by ADFT through the stationary-phase approximation (SPA) (SPA-limited regime), and (iii) the temporal resolution of the digitizer (digitizer-limited regime). These limiting factors are discussed in detail below in the context of 1-D and 2-D STEAM.

1-D STEAM
In 1-D STEAM, the space-frequency mapping is accomplished by using a 1-D spatial disperser to first generate a 1-D spectral shower [10,16-18]. Consider a diffraction grating used as the 1-D spatial disperser in 1-D STEAM, as shown in Fig. 2. Its spectral resolution is well known to be (assuming first-order diffraction)
$$\delta\lambda_{g} = \frac{\lambda\, d\cos\theta_{g}}{W}, \tag{1}$$
where $\theta_g$ is the diffracted angle under Littrow's condition [16], $\lambda$ is the center wavelength, $d$ is the grating period, and $W$ is the input beam waist. When the spatial resolution is governed by the spatial dispersion of the grating, it is said to be spatial-dispersion-limited and is given by
$$\delta x_{\mathrm{spatial}} = C_x\,\delta\lambda_{g} = f\,\frac{d\theta_g}{d\lambda}\,\delta\lambda_{g}, \tag{2}$$
where $f$ is the focal length of the objective lens, $C_x$ represents the conversion factor between space and wavelength, and $d\theta_g/d\lambda = 1/[d\cos(\theta_g)]$ is the angular dispersion of the diffraction grating. In practice, the 1-D spectral shower beam underfills the aperture of the objective lens so that the objective lens captures the whole spectrum. This implies that the spatial-dispersion-limited spatial resolution is worse than the diffraction-limited resolution.
On the other hand, the ADFT process should ideally perform one-to-one frequency-time mapping, i.e. only one wavelength contributes to the temporal waveform at any one time instant. However, there is always an ambiguity in this mapping process, which is a fundamental property of the dispersive Fourier transform [15]. Such an ambiguity, defined by the SPA, can be used as a measure of the spectral resolution of the ADFT process ($\delta\lambda_{\mathrm{SPA}}$) [15]:
$$\delta\lambda_{\mathrm{SPA}} = \lambda\sqrt{\frac{2}{D\,c}}, \tag{3}$$
where $D$ is the GVD and $c$ is the speed of light. Note that in Eq. (3) we neglect the higher-order dispersions, which result in nonlinear frequency-time mapping. Nevertheless, it is adequate here to illustrate the primary contribution of ADFT to the spatial resolution of STEAM. A more detailed analysis including higher-order dispersion can be found in Ref. [15]. Similar to Eq. (2), the spectral resolution $\delta\lambda_{\mathrm{SPA}}$ can be translated to the SPA-limited spatial resolution because of the space-frequency mapping. It can be written as
$$\delta x_{\mathrm{SPA}} = C_x\,\delta\lambda_{\mathrm{SPA}}. \tag{4}$$
Moreover, the temporal resolution (finite bandwidth) of the optical detection system in STEAM (i.e., the photodetector and electronic digitizer) can also be a limiting factor of the spatial resolution of STEAM. This can be understood by recognizing that the temporal resolution of the digitizer imposes an equivalent spectral resolution via ADFT, which is given by [15]
$$\delta\lambda_{\mathrm{det}} \approx \frac{0.35}{D\,f_{\mathrm{det}}}, \tag{5}$$
where $f_{\mathrm{det}}$ is the bandwidth of the detection system (i.e., the photodetector and digitizer). Hence, the digitizer-limited spatial resolution can be expressed as
$$\delta x_{\mathrm{det}} = C_x\,\delta\lambda_{\mathrm{det}}. \tag{6}$$
The contribution of each of the above three limiting factors [i.e. Eqs. (2), (4) and (6)] to the actual spatial resolution of 1-D STEAM ($\delta x_{1D}$) is exemplified in Fig. 3(a). The values of the parameters used in this example are detailed in the figure caption. We observe that the spatial resolution is digitizer-limited (i.e. $\delta x_{1D} = \delta x_{\mathrm{det}}$) at small GVD. Increasing the GVD can bring the system into the SPA-limited regime (i.e. $\delta x_{1D} = \delta x_{\mathrm{SPA}}$) and then into the spatial-dispersion-limited regime (i.e. $\delta x_{1D} = \delta x_{\mathrm{spatial}}$). The corresponding temporal resolution, which is given by $\delta t_{1D} = D\cdot\delta x_{1D}/C_x$, is shown in Fig. 3(b). Ideally, the temporal resolution of the STEAM system should be limited by the available GVD, not by the bandwidth of the digitizer, while the best achievable spatial resolution is maintained [$\delta x_{1D}$ ~0.5 µm in Fig. 3(a)]. To meet this criterion, the favorable operating regime is ~0.1 - 1 ns/nm in this example [see Fig. 3].
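The interplay of the three limiting regimes can be illustrated with a minimal numerical sketch. It assumes that the grating resolution is λ·d·cos(θ g)/W, that the SPA-limited resolution is λ·(2/(D·c))^1/2, that the detection-limited resolution follows the common rise-time estimate 0.35/(D·f det) (an assumption of this sketch), and that the coarsest of the three limits sets the actual resolution. The function names and all parameter values are hypothetical and are not those of Fig. 3.

```python
import math

C_LIGHT = 299_792_458.0  # speed of light in m/s


def grating_resolution(wavelength, period, theta_g, beam_waist):
    """Grating spectral resolution (first order): lambda * d * cos(theta_g) / W, in metres."""
    return wavelength * period * math.cos(theta_g) / beam_waist


def spa_resolution(wavelength, gvd):
    """SPA-limited spectral resolution: lambda * sqrt(2 / (D * c)); gvd in s/m (1 ns/nm = 1 s/m)."""
    return wavelength * math.sqrt(2.0 / (gvd * C_LIGHT))


def detector_resolution(gvd, f_det):
    """Spectral resolution set by the detection bandwidth f_det (Hz).

    ASSUMPTION: the detector rise time is approximated as 0.35 / f_det.
    """
    return 0.35 / (gvd * f_det)


def resolution_1d(wavelength, period, theta_g, beam_waist, focal_length, gvd, f_det):
    """Return (limiting regime, spatial resolution in metres) for 1-D STEAM."""
    # Space-to-wavelength conversion factor C_x = f * dtheta_g/dlambda
    c_x = focal_length / (period * math.cos(theta_g))
    limits = {
        "spatial-dispersion": c_x * grating_resolution(wavelength, period, theta_g, beam_waist),
        "SPA": c_x * spa_resolution(wavelength, gvd),
        "digitizer": c_x * detector_resolution(gvd, f_det),
    }
    # ASSUMPTION: the coarsest of the three limits sets the actual resolution.
    regime = max(limits, key=limits.get)
    return regime, limits[regime]


if __name__ == "__main__":
    # Hypothetical parameters, chosen only for illustration.
    wavelength = 800e-9                               # m
    period = 1e-3 / 1200                              # 1200 lines/mm grating
    theta_g = math.asin(wavelength / (2 * period))    # Littrow angle
    for gvd_ns_per_nm in (0.01, 0.15, 1.0):
        regime, dx = resolution_1d(wavelength, period, theta_g,
                                   5e-3, 2e-3, gvd_ns_per_nm, 20e9)
        print(f"GVD = {gvd_ns_per_nm} ns/nm: {regime}-limited, dx = {dx * 1e6:.2f} um")
```

With these illustrative values the sketch moves from the digitizer-limited regime through the SPA-limited regime to the spatial-dispersion-limited regime as the GVD increases, where the resolution settles at f·λ/W.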

2-D STEAM
Here we require a 2-D spatial disperser to map the spectrum into a 2-D space, generating a 2-D spectral shower on which the spatial information is encoded. The 2-D spatial disperser consists of two spatial dispersers: a diffraction grating and a virtually-imaged phased array (VIPA) [19]. Their dispersion directions are orthogonal to each other. The key feature of the VIPA is that wavelengths that differ by integer multiples of its free spectral range (FSR) spatially overlap with each other in one dimension. A diffraction grating is used to remove this spatial degeneracy in the orthogonal dimension, resulting in a 2-D spectral shower (Fig. 4). Such a grating-VIPA arrangement has previously been used for spectroscopy [20] and for wavelength de-multiplexing in telecommunication applications [21]. In contrast, here we employ this 2-D spatial disperser for the purpose of imaging [11,22].
Following the 1-D analysis, we note that the spatial resolution of 2-D STEAM is also governed by the three aforementioned limiting regimes (i.e., the spatial-dispersion-limited, SPA-limited and digitizer-limited regimes). Moreover, the spatial-dispersion-limited spatial resolutions in the two orthogonal dimensions are different because of the different dispersive properties of the VIPA and the diffraction grating. To estimate these spatial resolutions, we need to recognize the different effects which limit the spatial resolution in the x- and y-directions.

Spatial-dispersion-limited spatial resolution in the y-direction
The spectral resolution of the VIPA, namely its finite spectral linewidth, is the key factor determining the spatial-dispersion-limited spatial resolution in the y-direction in 2-D STEAM. This linewidth ($\delta\lambda_{\mathrm{VIPA}}$) is defined by the full width at half maximum (FWHM) of the transmission resonance spectrum of the VIPA, which is given in Ref. [19] [Eq. (7)].
In Eq. (7), $R_1$ and $R_2$ are the reflectivities of the VIPA's front and back mirrors, respectively; $\theta_{in}$ is the angle related to the VIPA's tilt angle $\theta_{\mathrm{VIPA}}$ by $\sin(\theta_{\mathrm{VIPA}}) = n\sin(\theta_{in})$, where $n$ is the refractive index of the VIPA; $t$ is the thickness of the VIPA; and $k$ is the wavenumber, $k = 2\pi n/\lambda$. $N$ is the number of the VIPA's "virtual sources" that contribute to the actual spectral shower, and it is an important parameter determining the spectral resolution of the VIPA. It can be approximated as $N \approx L/[2t\tan(\theta_{in})]$, where $L$ is the aperture size of the VIPA. However, if the output beam from the VIPA overfills the back aperture of the objective lens (with a size of $D$), the objective lens is unable to capture all the "virtual sources". In this case, $N$ is instead limited by the aperture size of the objective lens, i.e. $N \approx D/[2t\tan(\theta_{in})]$. For example, with $L$ = 1 cm, $\theta_{in}$ = 3°, $t$ = 1.5 mm, and $D$ = 5 mm, we can only obtain $N \approx 25$. The consequence of such a finite $N$ is linewidth broadening [see Eq. (7)], which in turn limits the spatial resolution in the y-direction ($\delta y^{2D}_{\mathrm{spatial}}$) through space-frequency mapping. This finite-$N$ effect is analogous to the diffraction grating, in which the number of illuminated groove lines defines the spectral resolution. Similar to Eq. (2), $\delta y^{2D}_{\mathrm{spatial}}$ is given by
$$\delta y^{2D}_{\mathrm{spatial}} = C_y\,\delta\lambda_{\mathrm{VIPA}} = f\left|\frac{d\theta_{\mathrm{VIPA}}}{d\lambda}\right|\delta\lambda_{\mathrm{VIPA}}, \tag{8}$$
where $C_y$ is the conversion factor between space in the y-direction and the wavelength, and $d\theta_{\mathrm{VIPA}}/d\lambda \approx -2[n^2 - \sin^2\theta_{\mathrm{VIPA}}]/[\lambda\sin(2\theta_{\mathrm{VIPA}})]$ is the angular dispersion of the VIPA [19]. Here, we ignore the higher-order dispersion effect of the VIPA, which is only significant when a large diffracted angle from the VIPA is considered [19].
The finite-$N$ effect on $\delta y^{2D}_{\mathrm{spatial}}$ is experimentally verified as shown in Fig. 5. Figure 5(a) shows the spectral shower generated using a broadband pulsed beam. A continuous-wave (CW) laser beam is also coupled in and appears as a single spot on top of the spectral shower.

Spatial-dispersion-limited spatial resolution in the x-direction
From Fig. 5(a), it is conceivable that the spatial-dispersion-limited spatial resolution in the x-direction ($\delta x^{2D}_{\mathrm{spatial}}$) can be defined as one column separation of the spectral shower ($\delta x^{2D}_{\mathrm{FSR}}$), which corresponds to one FSR of the VIPA in wavelength ($\Delta\lambda_{\mathrm{FSR}}$). This relation can be written as
$$\delta x^{2D}_{\mathrm{spatial}} = \delta x^{2D}_{\mathrm{FSR}} = C_x\,\Delta\lambda_{\mathrm{FSR}}, \tag{9}$$
where $\Delta\lambda_{\mathrm{FSR}} \sim \lambda^2/[2nt\cos(\theta_{in})]$. Considering the example shown in Fig. 5(a), we can estimate $\delta x^{2D}_{\mathrm{spatial}}$ ~8 µm using Eq. (9). This matches the column separation shown in Fig. 5(a) reasonably well. It is also in good agreement with the recent demonstration of simultaneous mechanical-scan-free microscopy and laser surgery using a similar 2-D spectral shower [22].
In contrast, the actual spatial resolution in the x-direction ($\delta x_{2D}$) does not depend on the properties of ADFT and the digitizer, and hence is only affected by the spatial-dispersion properties of the 2-D spatial disperser. Therefore, $\delta x_{2D} = \delta x^{2D}_{\mathrm{spatial}}$.
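As a rough numerical illustration of the FSR relation quoted above, the following sketch evaluates ∆λ FSR ~ λ²/[2nt·cos(θ in)] and the corresponding column separation C x·∆λ FSR. The thickness and angle are taken from the VIPA example in the text (t = 1.5 mm, θ in = 3°); the wavelength (800 nm), the refractive index (n = 1.5), the conversion factor C x and the function names are assumptions made only for this sketch and are not the parameters of Fig. 5.

```python
import math


def vipa_fsr(wavelength, n, thickness, theta_in):
    """Free spectral range of the VIPA in metres: lambda^2 / (2 n t cos(theta_in))."""
    return wavelength ** 2 / (2.0 * n * thickness * math.cos(theta_in))


def column_separation(c_x, fsr):
    """Column separation of the spectral shower: C_x * FSR (C_x in metres of space per metre of wavelength)."""
    return c_x * fsr


if __name__ == "__main__":
    # ASSUMED values: wavelength and refractive index are not given for this example.
    fsr = vipa_fsr(wavelength=800e-9, n=1.5, thickness=1.5e-3, theta_in=math.radians(3.0))
    print(f"FSR = {fsr * 1e9:.3f} nm")

    # HYPOTHETICAL conversion factor of 50 um/nm (5e4 m per m).
    dx = column_separation(c_x=5e4, fsr=fsr)
    print(f"column separation = {dx * 1e6:.2f} um")
```

Because the FSR scales inversely with the VIPA thickness, a thinner VIPA gives fewer but more widely separated columns for a fixed spectral-shower bandwidth.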
The dependence of the actual spatial resolution in the y-direction ($\delta y_{2D}$) on GVD follows a trend similar to that of the 1-D case [compare Fig. 7(a) and Fig. 3(a)]. The system moves from digitizer-limited, through SPA-limited, to spatial-dispersion-limited operation with increasing GVD. The same trend can also be observed in the temporal resolution, which is given by $\delta t_{2D} = D\cdot\delta y_{2D}/C_y$ [Fig. 7(b)]. In this example, the 2-D STEAM system should preferably operate at ~5 - 6 ns/nm so that its temporal resolution is limited only by the available GVD while the best achievable spatial resolution is maintained, i.e. $\delta y_{2D}$ ~0.7 µm. It should be noted that such a large GVD is readily made possible by means of optical amplification in STEAM, which overcomes the inherent loss associated with GVD. An extraordinarily large dispersion of > 10 ns/nm has been demonstrated in ADFT-based spectroscopy with ultra-high spectral resolution [13].

Field-of-view, number of pixels and imaging frame rate
The field-of-view (FOV) of 1-D STEAM scales with the bandwidth of the spectral shower ($\Delta\lambda_{SS}$). Based on Eq. (2), the 1-D FOV is given by $\Delta x_{1D} = C_x\cdot\Delta\lambda_{SS}$. On the other hand, the FOV of 2-D STEAM is set by the spectral shower bandwidth and the FSR of the VIPA in the x- and y-directions, respectively. It can thus be estimated as $\Delta x_{2D} = C_x\cdot\Delta\lambda_{SS}$ and $\Delta y_{2D} = C_y\cdot\Delta\lambda_{\mathrm{FSR}}$. Note that, with the other parameters unchanged, the aspect ratio of the FOV can be tuned by varying $\Delta\lambda_{SS}$ and $\Delta\lambda_{\mathrm{FSR}}$. Using the parameters shown in Fig. 5, we find $\Delta y_{2D}$ ~80 µm, which agrees well with the measurement [Fig. 5(a)], whereas $\Delta x_{2D}$ ~200 µm with $\Delta\lambda_{SS}$ = 15 nm.
Another important parameter is the total number of pixels. It is equivalent to the number of data points sampled within each image-encoded temporal pulse after ADFT (i.e. a single image frame). In 1-D STEAM, the temporal width of each pulse is set by the spectral shower bandwidth ($\Delta\lambda_{SS}$) through ADFT and is given by $\Delta\lambda_{SS}\cdot D$. The total number of pixels in 1-D STEAM ($N_{1D}$) can therefore be written as
$$N_{1D} = \Delta\lambda_{SS}\cdot D\cdot f_{dig},$$
where $f_{dig}$ is the sampling rate of the digitizer. In 2-D STEAM, the numbers of pixels along the two orthogonal dimensions are defined differently. In the x-direction, the number of pixels ($N_{2D\text{-}x}$) is essentially the number of columns in the spectral shower. In contrast, the number of pixels in the y-direction ($N_{2D\text{-}y}$) is the number of points sampled by the digitizer along each column of the spectral shower. Hence,
$$N_{2D\text{-}x} = \frac{\Delta\lambda_{SS}}{\Delta\lambda_{\mathrm{FSR}}}, \qquad N_{2D\text{-}y} = \Delta\lambda_{\mathrm{FSR}}\cdot D\cdot f_{dig}.$$
Note that because of the serialization, only $N_{2D\text{-}y}$ depends on the GVD and the sampling rate. The total number of pixels $N_{2D}$ is thus given by
$$N_{2D} = N_{2D\text{-}x}\cdot N_{2D\text{-}y} = \Delta\lambda_{SS}\cdot D\cdot f_{dig} < \frac{f_{dig}}{f_{rep}}, \tag{19}$$
where $f_{rep}$ is the repetition rate of the pulsed laser, and hence the frame rate of STEAM. From Eq. (19), we note that increasing the GVD or the optical bandwidth increases the frame length in time and hence the number of pixels.
However, this comes at the expense of the laser repetition rate (i.e., the frame rate), which must be reduced in order to avoid overlap between consecutive frames (i.e., $\Delta\lambda_{SS}\cdot D < f_{rep}^{-1}$). Fortunately, this limitation can be overcome by using a technique called virtual time gating, based on wavelength division multiplexing, to carve the single frame into multiple bands [23]. In such a parallel architecture, the number of pixels can be increased by the number of parallel channels ($M$) without sacrificing the frame rate (see Fig. 8),
$$N_{1D} = N_{2D} = M\cdot\Delta\lambda_{SS}\cdot D\cdot f_{dig}. \tag{20}$$
As shown in Eqs. (19) and (20), the number of pixels can be scaled up by increasing the optical bandwidth, the GVD, the digitizer sampling rate, and the number of parallel channels used for virtual time gating. For example, it is practical with today's technology to achieve ~300,000 pixels at a frame rate of 1 MHz using an optical bandwidth of ~150 nm, a dispersion of ~5 ns/nm, a digitizer sampling rate of 50 GS/s, and eight parallel virtual time gating channels (M = 8) (Fig. 8).
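The arithmetic behind this example can be checked with a short sketch. It assumes that the pixel count scales as M·∆λ SS·D·f dig, which reproduces the ~300,000 pixels quoted above, and that consecutive frames do not overlap when ∆λ SS·D is shorter than the laser period 1/f rep. The function names and the unit conventions (nm, ns/nm, GS/s, MHz) are choices made for this illustration.

```python
def frame_duration_ns(bandwidth_nm, gvd_ns_per_nm):
    """Temporal length of one image-encoded pulse after ADFT: bandwidth * GVD, in ns."""
    return bandwidth_nm * gvd_ns_per_nm


def pixel_count(bandwidth_nm, gvd_ns_per_nm, f_dig_gsps, channels=1):
    """Number of pixels per frame: M * bandwidth * GVD * sampling rate.

    GS/s multiplied by ns gives a number of samples, so no unit conversion is needed.
    ASSUMPTION: each of the M virtual-time-gating channels contributes equally.
    """
    return channels * frame_duration_ns(bandwidth_nm, gvd_ns_per_nm) * f_dig_gsps


def frames_overlap(bandwidth_nm, gvd_ns_per_nm, f_rep_mhz):
    """True if the stretched frame is at least as long as the laser period (frames would overlap)."""
    period_ns = 1e3 / f_rep_mhz
    return frame_duration_ns(bandwidth_nm, gvd_ns_per_nm) >= period_ns


if __name__ == "__main__":
    # Values quoted in the text: 150 nm, 5 ns/nm, 50 GS/s, M = 8, 1 MHz frame rate.
    print(pixel_count(150, 5, 50, channels=8))   # 300000 pixels
    print(frame_duration_ns(150, 5))             # 750 ns, shorter than the 1000 ns period
    print(frames_overlap(150, 5, 1.0))           # False
```

The same functions show the trade-off directly: raising the frame rate to 5 MHz with the same bandwidth and GVD would make the 750 ns frame longer than the 200 ns laser period.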

Detection sensitivity
The detection sensitivity of STEAM is typically limited by a number of noise sources, namely the inherent shot noise of the input light ($N_{shot}$), the dark current noise ($N_{dark}$) and the thermal noise ($N_{thermal}$) of the photodetector. The shot noise is given by $N_{shot} = S_{in}^{1/2}$, where $S_{in}$ is the number of signal photoelectrons collected by the photodetector. For an optical amplification process with a gain of $G$ and a noise figure of $F$, the resultant shot noise at the photodetector becomes $G\cdot F\cdot N_{shot}$ [24]. Hence, the total noise of the system is
$$N_{total} = \left(G^2 F^2 N_{shot}^2 + N_{dark}^2 + N_{thermal}^2\right)^{1/2}. \tag{21}$$
Figure 9(a) shows the individual noise components in the system as a function of the number of signal photons per pixel, which is defined as $\Phi/f_{dig}$, where $\Phi$ is the signal photon flux. A system without gain (i.e. $G$ = 1) is shot-noise-limited when the signal photon number is > ~1000. A decrease in the signal photon number (< ~1000), however, makes the system thermal-noise-limited ($N_{dark} \ll N_{thermal}$ at room temperature). In contrast, the system becomes shot-noise-limited in the presence of optical amplification. The merit of optical amplification becomes more apparent when the SNR of the STEAM system is examined. With an optically amplified signal, i.e. $G\cdot S_{in}$, the SNR is given by
$$\mathrm{SNR} = \frac{G\,S_{in}}{\left(G^2 F^2 N_{shot}^2 + N_{dark}^2 + N_{thermal}^2\right)^{1/2}} = \frac{S_{in}}{\left[F^2 N_{shot}^2 + \left(N_{dark}^2 + N_{thermal}^2\right)/G^2\right]^{1/2}}, \tag{22}$$
where $S_{in} = \eta\cdot\Phi/f_{dig}$ and $\eta$ is the quantum efficiency of the photodetector. Equation (22) shows that optical amplification effectively reduces both the dark current noise and the thermal noise by a factor of $G$, at the cost of an increase in shot noise by a factor of $F$. Hence, the SNR can be significantly enhanced if we implement the optical amplification with high gain and a low noise figure.
Here, we employ distributed Raman amplification (DRA) within the dispersive fiber because DRA is well known to have a widely tunable and broadband gain spectrum, a low noise figure, and the ability to maximize the SNR-to-distortion ratio by keeping the signal power away from the low-power (noisy) and high-power (nonlinear) regimes, owing to its distributed amplification nature [10-14]. The improvement in SNR as a result of optical amplification is depicted in Fig. 9(b). The optical amplification considerably enhances the SNR, especially in the low-light regime (i.e. < ~1000 photons per pixel) in which the system is originally thermal-noise-limited. Furthermore, the minimum detectable number of signal photons is as low as ~7 for SNR = 1. This corresponds to an input-referred noise of ~-150 dBm/Hz at λ = 800 nm. It should be emphasized again that such an improvement in detection sensitivity is what enables ultrafast imaging, on the order of MHz, in STEAM.
Fig. 9. (a) Noise components of STEAM: dark current noise (black), thermal noise (blue), shot noise with gain, G = 30 dB (red solid line), and shot noise without gain (red dashed line). (b) SNR of the system with gain, G = 30 dB (red), and without gain (blue). The system parameters are: a photodetector with a bandwidth of 40 GHz, a dark current of 100 nA, a noise-equivalent power of 50 pW/Hz^1/2, and η = 0.8; f dig = 50 GS/s; the wavelength is 800 nm. The detailed calculation of N dark and N thermal can be found in Ref. [24]. We assume that DRA is employed within the dispersive fiber with a noise figure of ~3.5 dB [25]. The dispersive fiber loss is assumed to be 20 dB. The black dashed line in (b) represents the theoretical shot-noise-limited SNR.
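The effect of gain on the SNR can be sketched numerically. The sketch below assumes that the noise terms add in quadrature, with the amplified shot noise equal to G·F·N shot and N shot equal to the square root of S in, and it treats N dark and N thermal as given inputs expressed in electrons per sample. The values used for N dark and N thermal are hypothetical placeholders rather than the values behind Fig. 9, the 20 dB fiber loss is not modelled, and the function names are choices made for this illustration.

```python
import math


def db_to_linear(value_db):
    """Convert a power ratio in dB to a linear factor."""
    return 10.0 ** (value_db / 10.0)


def snr(s_in, gain, noise_figure, n_dark, n_thermal):
    """SNR for an optically amplified signal G * S_in.

    ASSUMPTION: independent noise terms add in quadrature, with the
    amplified shot noise equal to G * F * sqrt(S_in).
    """
    n_shot = math.sqrt(s_in)
    total_noise = math.sqrt((gain * noise_figure * n_shot) ** 2
                            + n_dark ** 2 + n_thermal ** 2)
    return gain * s_in / total_noise


if __name__ == "__main__":
    # HYPOTHETICAL detector noise, in electrons per sample.
    n_dark, n_thermal = 1.0, 1000.0
    gain = db_to_linear(30.0)          # G = 30 dB
    noise_figure = db_to_linear(3.5)   # F ~ 3.5 dB
    for s_in in (10, 100, 1000, 10000):
        without_gain = snr(s_in, 1.0, 1.0, n_dark, n_thermal)
        with_gain = snr(s_in, gain, noise_figure, n_dark, n_thermal)
        print(f"S_in = {s_in:>5}: SNR without gain = {without_gain:.3f}, "
              f"with gain = {with_gain:.3f}")
```

In this sketch the amplified SNR approaches the value sqrt(S in)/F once G is large enough for the detector noise to become negligible, which is the behaviour described for the low-light regime.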

Concluding remarks
In conclusion, we have presented a model that quantifies the spatial and temporal resolution of the recently demonstrated STEAM imaging technology. We have also quantified the imager's detection sensitivity. We have shown that the spatial resolution is not simply governed by the diffraction limit, but also depends on the spectral resolutions imposed by (1) the spatial disperser, (2) the amplified dispersive Fourier transform (ADFT), and (3) the digitizer's sampling speed and input bandwidth. We have also shown that there is a trade-off between the number of pixels and the frame rate as a consequence of the image serialization. However, this can be remedied by a technique known as virtual time gating, an all-optical time demultiplexer that enables parallel detection. This analysis not only provides valuable insight into this new imaging system, but also serves as a tool for the design and optimization of STEAM imaging systems.
Real-time, continuous and ultrafast operation of STEAM naturally meets the demands of many high-speed imaging applications which involve detecting fast events that are very rare (rogue events). Both the 1-D and 2-D STEAM configurations can find a myriad of applications in these areas. 2-D STEAM in its native form could play a significant role in non-invasive high-speed reflectance confocal imaging, which has been employed in clinical applications, e.g. to monitor fast response to therapeutic treatment such as laser surgery [26,27].
In contrast, having a simpler spatial disperser implementation and a less stringent requirement on temporal dispersion (see Figs. 3 and 7), 1-D STEAM could find a compelling application in high-speed and sensitive imaging flow cytometry [28]. In this case, the 1-D STEAM imager, operating in line-scan mode, is able to reconstruct 2-D cross-sectional images of the cells as they flow in the microfluidic channel at a high flow rate. Current flow cytometers have no means of offering real-time, high-speed imaging that matches the throughput of state-of-the-art flow cytometers (i.e. up to 100,000 cells per second [28]). This is mostly because of the lack of a camera technology with a sufficient combination of speed and sensitivity. Combined with flow cytometry, 1-D STEAM provides an attractive means of delivering real-time imaging of cells and potentially performing screening of rare cancer cells such as circulating tumor cells [29].