Achieving the Shot-noise Limit Using Experimental Multi-shot Digital Holography Data

In this paper, we achieve the shot-noise limit using straightforward image-post-processing techniques with experimental multi-shot digital holography data (i.e., off-axis data composed of multiple noise and speckle realizations). First, we quantify the effects of frame subtraction (of the mean reference-only frame and the mean signal-only frame from the digital-hologram frames), which boosts the signal-to-noise ratio (SNR) of the baseline dataset with a gain of 2.4 dB. Next, we quantify the effects of frame averaging, both with and without the frame subtraction. We show that even though the frame averaging boosts the SNR by itself, the frame subtraction and the stability of the digital-hologram fringes are necessary to achieve the shot-noise limit. Overall, we boost the SNR of the baseline dataset with a gain of 8.1 dB, which is the gain needed to achieve the shot-noise limit.

With the above details in mind, Spencer recently used a scalar formulation (and the assumptions therein) to develop closed-form expressions for the SNR, S/N, associated with the off-axis and on-axis recording geometries often used when performing digital holography [30]. For all intents and purposes, these closed-form expressions took the following form:
$$ S/N = \eta_t \, \alpha \, \frac{m_S \, m_R}{m_S + m_R + \sigma_n^2}, \tag{1} $$
where η t is the total-system efficiency, α is a recording-geometry constant, m S and m R are, respectively, the mean number of signal and reference photoelectrons (assuming Poisson statistics), and σ 2 n is the total-noise variance associated with the focal-plane array (FPA) read-out integrated circuitry (assuming Gaussian statistics). With the use of a strong reference, m R ≫ m S and m R ≫ σ 2 n . As such, we can approach a shot-noise-limited detection regime, such that
$$ S/N \approx \eta_t \, \alpha \, m_S. \tag{2} $$
In writing Eqs. (1) and (2), one must acknowledge that the number of hologram photoelectrons, m H = m S + m R , cannot exceed the pixel-well depth of the FPA, as that would lead to camera-saturation effects. This last point leads to an interesting trade space using modern-day cameras, and recent modeling and simulation efforts validated the use of these closed-form expressions [31,32]. In particular, these analyses made use of wave-optics simulations, assuming an ideal total-system efficiency (i.e., η t = 100%), and showed that one is not guaranteed a shot-noise-limited detection regime if the pixel-well depth is on the order of σ 2 n . Independent of being in a shot-noise-limited detection regime, recent laboratory experiments also showed that efficiency losses further limit the achievable SNR [33][34][35]. With Eqs. (1) and (2) in mind, these experiments showed that one can decompose the total-system efficiency, η t , into independent multiplicative terms, which represent the various physical phenomena that induce efficiency losses.
These efficiency losses degrade the achievable SNR and are quantifiable with the appropriate digital-holography datasets and image-post-processing techniques.
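As a rough numerical illustration of the strong-reference limit, the sketch below assumes that Eq. (1) has the form S/N = η t α m S m R /(m S + m R + σ 2 n ), which reduces to η t α m S when m R dominates. The noise variance of 100 pe 2 is a hypothetical value chosen only for illustration, and the function names are not taken from the cited work.

```python
def snr_closed_form(eta_t, alpha, m_s, m_r, sigma_n2):
    """Assumed general form: eta_t * alpha * m_S * m_R / (m_S + m_R + sigma_n^2)."""
    return eta_t * alpha * m_s * m_r / (m_s + m_r + sigma_n2)


def snr_shot_noise_limited(eta_t, alpha, m_s):
    """Assumed strong-reference limit: eta_t * alpha * m_S."""
    return eta_t * alpha * m_s


def fraction_of_limit(m_s, m_r, sigma_n2):
    """Ratio of the general form to its limit (independent of eta_t and alpha)."""
    general = snr_closed_form(1.0, 1.0, m_s, m_r, sigma_n2)
    return general / snr_shot_noise_limited(1.0, 1.0, m_s)


if __name__ == "__main__":
    m_s = 71.0        # mean signal photoelectrons (baseline value quoted in the paper)
    sigma_n2 = 100.0  # hypothetical read-noise variance in pe^2
    for m_r in (100.0, 2676.0, 1.0e6):
        frac = fraction_of_limit(m_s, m_r, sigma_n2)
        print(f"m_R = {m_r:>9.0f} pe -> {frac:.3f} of the shot-noise-limited SNR")
```

Under these assumptions, the ratio depends only on m R /(m S + m R + σ 2 n ), so a stronger reference moves the SNR toward the limit without ever exceeding it.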
Another laboratory experiment recently showed that given multi-shot digital holography data (in this case, off-axis data composed of multiple noise and scintillation realizations), one can use straightforward post-processing techniques like frame subtraction and frame averaging to boost the SNRs associated with their digital-holography datasets [28]. This experiment, however, did not attempt to quantify these SNR boosts in terms of efficiency losses. In turn, we realized that such an analysis could have distinct benefits for other laboratory experiments, like those that use digital-holographic microscopy [36]. We also realized that such an analysis could have distinct benefits for field applications like long-range imaging, as previously mentioned, in addition to imaging through fog [37].
These aforementioned realizations provided the motivation needed to perform the digital-holography research presented in this paper. Put simply, we wanted to quantify the effects of straightforward image-post-processing techniques in terms of the efficiency losses that degrade the achievable SNR. In turn, we discovered that we can use frame subtraction and frame averaging, along with multi-shot digital holography data (in this case, off-axis data composed of multiple noise and speckle realizations), to achieve the shot-noise limit. Given the detailed analysis presented herein, this discovery serves as a novel contribution to the digital-holography research community. With this novelty statement in mind, it is important to note that past research efforts have claimed to achieve the shot-noise limit [38], but their definition for what this fundamental limit entails differs from the detailed analysis presented herein.
In what follows, we define the shot-noise limit (for the experimental multi-shot digital holography data referred to throughout this paper) as the gain needed to boost the SNR, such that it equals the closed-form expression given in Eq. (2) with an ideal total-system efficiency (i.e., η t = 100%). In Section 2, we simply refer to this shot-noise-limited SNR as the ideal SNR [cf. Equation (3)]. We also provide the background details needed to understand the experimental setup, SNR calculations, and efficiency calculations used to achieve the shot-noise limit. In Section 3, we then quantify the effects of frame subtraction, and in Section 4 we quantify the effects of frame averaging, both with and without the frame subtraction. Thereafter, we conclude this paper in Section 5, and we include an appendix that shows that frame subtraction is a necessary first step to achieve the shot-noise limit.

Background details
In this section, we discuss the background details associated with the experimental setup used to collect the various digital-holography datasets referred to throughout this paper. We also discuss the background details associated with the SNR and efficiency calculations. Previous efforts made use of similar setups and calculations to investigate the various efficiency losses that degrade the achievable SNR [33][34][35]. These previous efforts, in addition to the recent work of Radosevich et al. [28], provide the insights needed to develop the straightforward image-post-processing techniques presented in this paper to achieve the shot-noise limit.

Experimental setup
We collected the various digital-holography datasets referred to throughout this paper, like the baseline dataset illustrated in Fig. 1, in the off-axis image-plane recording geometry (IPRG) [25,31]. For this purpose, we started with a continuous-wave, master-oscillator (MO) laser (Cobolt Samba 1000) with a wavelength of 532 nm and a linewidth less than 1 MHz. We then split the light from the MO laser into a local oscillator (LO) and an illuminator using a half-wave plate and polarizing beam splitter (PBS) cube. For the LO, we fiber coupled the light split off from the PBS cube and placed the tip of the single-mode, polarization-maintaining fiber next to an imaging lens. The diverging light from the fiber tip illuminated a 2048 × 1536 pixel region of interest on the focal-plane array (FPA) of the camera (Point Grey Grasshopper3 GS3-U3-32S4M-C) to create a reference. As shown in Fig. 1 (a), the FPA's coverglass produced an etalon-interference pattern, which yielded a non-uniform reference.
For the illuminator, we expanded the near-Gaussian beam to a diameter of approximately 4 cm and illuminated a sheet of Spectralon. By design, the Spectralon provided an optically rough surface with 99% reflectivity and near-Lambertian scattering, which produced speckle. We imaged this speckle with a one-inch-diameter lens onto the FPA to create a signal [see Fig. 1 (b)]. In accordance with the off-axis IPRG [25,31], we placed the imaging lens, with a focal length of 350 mm, 2.46 m away from the Spectralon.
Fig. 1. Overview of the baseline dataset used in this paper. The top row depicts the average frames, whereas the bottom row depicts the corresponding average Fourier-plane energies, where the camera-integration time and optical-path-length difference were t i = 100 µs and ∆ℓ = 0 m, respectively. Here, (a) shows the mean reference-only frame m R (x, y), (b) shows the mean signal-only frame m S (x, y), (c) shows the mean digital-hologram frame m H (x, y), and (d)-(f) show the corresponding average Fourier-plane energies.
To achieve a strong reference, the reference strength was set to approximately 25% of the FPA's pixel-well depth, such that m R ≈ 2,676 pe, where m R is again the mean number of reference photoelectrons. We then set the signal strength to m S ≈ 71 pe, where m S is again the mean number of signal photoelectrons. As shown in Fig. 1 (c), the resulting digital hologram maintained aspects of the near-Gaussian speckle pattern due to the signal and the etalon-interference pattern due to the reference.
In this paper, we used four datasets with a combination of two different camera-integration times, t i = 100 µs and 100 ms, and two different optical-path-length differences between the signal and reference, ∆ℓ = 0 m and 247.5 m. We created the optical-path-length differences by inserting an additional 165 m length of fiber, with a refractive index of 1.5, in the reference path, relative to the fixed signal path. Each dataset contains a series of 200 digital-hologram frames, 200 signal-only frames, and 200 reference-only frames.
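The quoted optical-path-length difference follows directly from the fiber length and refractive index. The short sketch below reproduces that arithmetic and also reports the corresponding time delay between the two paths; the delay is a derived quantity that the text does not state, and the function name is only illustrative.

```python
C_VACUUM = 299_792_458.0  # speed of light in vacuum, m/s


def optical_path_length(fiber_length_m, refractive_index):
    """Optical path length added by a fiber of the given physical length."""
    return fiber_length_m * refractive_index


delta_l = optical_path_length(165.0, 1.5)  # fiber length and index quoted in the text
delay_s = delta_l / C_VACUUM               # derived here, not stated in the text

print(f"delta_l = {delta_l:.1f} m")
print(f"delay   = {delay_s * 1e9:.1f} ns")
```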
For the 200 digital-hologram and signal-only frames, we collected 10 speckle realizations by rotating the Spectralon to illuminate a completely different portion of the optically rough surface. To average the shot noise, we collected 20 digital-hologram, signal-only, and reference-only frames sequentially for each speckle realization. Our experimental procedure, overall, consisted of collecting 20 digital-hologram, 20 signal-only, and 20 reference-only frames, then we rotated the Spectralon and repeated this process 10 times. Thus, in Fig. 1 we show the average of the 200 reference-only frames, signal-only frames, and digital-hologram frames, respectively, in the top row and their corresponding average energies in the Fourier plane in the bottom row.
As shown in Fig. 1 (a) and (c), the etalon-interference pattern due to the reference produced two main sets of fringes. The approximate periodicity of both fringe sets corresponded to low-spatial-frequency features in the Fourier plane, as seen in Fig. 1 (d) and (f), respectively. Fortunately, these low-spatial-frequency features are outside of the pupil filter and did not considerably contribute to the sampled noise in the Fourier plane. However, as we show in the ensuing analysis, the non-uniform reference can yield excess noise above the reference shot noise [33].
In accordance with the off-axis IPRG [25,31], we had an image-plane sampling quotient, q I , of 2.7 and a circular pupil approximately centered in the top-right quadrant of the Fourier plane [see Fig. 1 (f)]. As a reminder, q I represents the number of pupil diameters across the Fourier plane. The autocorrelation of the signal created a strong, DC-centered feature in the Fourier plane. This feature was approximately conical, as described by the chat function [39,40], with a diameter of twice the pupil in the Fourier plane. With a q I = 2.7, the pupil filter sampled a significant portion of this chat-like feature from the autocorrelation of the signal in the Fourier plane [see Fig. 1 (e)]. As previously explained [33], this sampling of the chat-like feature yields excess noise due to the signal that increases quadratically with signal strength.
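To make this geometry concrete, the sketch below evaluates the normalized autocorrelation of a uniform circular pupil (the chat function) at the near edge, center, and far edge of the pupil filter. It assumes a uniformly filled pupil centered exactly in the middle of the quadrant, which the experiment only approximates, and it normalizes the Fourier-plane width to one.

```python
import math


def chat(rho):
    """Normalized autocorrelation of a uniform disk; rho is radius / disk diameter."""
    rho = abs(rho)
    if rho >= 1.0:
        return 0.0
    return (2.0 / math.pi) * (math.acos(rho) - rho * math.sqrt(1.0 - rho * rho))


q_i = 2.7
pupil_diameter = 1.0 / q_i                # in units of the Fourier-plane width
pupil_radius = pupil_diameter / 2.0
center_distance = math.hypot(0.25, 0.25)  # assumed: pupil centered in the quadrant

samples = {
    "near edge": center_distance - pupil_radius,
    "center": center_distance,
    "far edge": center_distance + pupil_radius,
}
for name, r in samples.items():
    print(f"{name:>9}: r = {r:.3f}, chat = {chat(r / pupil_diameter):.3f}")
```

Under these assumptions, the chat-like feature is still sizable at the near edge of the pupil filter and vanishes before the far edge, which is consistent with the pupil filter sampling only part of the feature.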

Signal-to-noise ratio calculations
With the shot-noise limit in mind, we derived a closed-form expression for the ideal SNR, S/N i . To do so, we assumed a uniform and strong reference, such that the dominant noise was the reference shot noise. Thus, for the off-axis IPRG [25,31], we obtained the following closed-form expression [cf. Equation (2), where η t = 100%]:
$$ S/N_i = \frac{4 q_I^2}{\pi} \, m_S, \tag{3} $$
where again, q I is the image-plane sampling quotient and m S is the mean number of signal photoelectrons. With q I = 2.7 and m S ≈ 71 pe, S/N i = 661 for the baseline dataset (cf. Fig. 1).
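As a quick numerical check, the sketch below evaluates the ideal SNR assuming it takes the form S/N i = (4q 2 I /π) m S , i.e., the inverse of the pupil-to-Fourier-plane area ratio multiplied by the mean number of signal photoelectrons. With the rounded value m S = 71 pe, this form gives roughly 659, so the quoted value of 661 presumably reflects an unrounded m S near 71.2 pe; that inference is an assumption of this sketch and is not stated in the text.

```python
import math


def ideal_snr(q_i, m_s):
    """Assumed shot-noise-limited SNR for the off-axis IPRG: (4 q_I^2 / pi) * m_S."""
    return 4.0 * q_i**2 / math.pi * m_s


def signal_for_snr(q_i, snr):
    """Invert the same expression for the mean number of signal photoelectrons."""
    return snr * math.pi / (4.0 * q_i**2)


print(f"S/N_i with m_S = 71 pe: {ideal_snr(2.7, 71.0):.1f}")
print(f"m_S implied by S/N_i = 661: {signal_for_snr(2.7, 661.0):.2f} pe")
```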
To estimate the SNR from the collected digital-hologram frames, we used the calculation in Eq. (4), which operates in the Fourier plane. In Eq. (4), S/N ′ is the estimated SNR, E H (f x , f y ) is the mean hologram energy (i.e., the magnitude squared of the complex data), E N (f x , f y ) is the mean noise energy, and ⟨·⟩ P denotes a spatial average over the pupil filter in the Fourier plane. To estimate E N (f x , f y ), we assumed that the noise in the Fourier plane was symmetric about the y-axis, so that the region mirroring the pupil filter across the y-axis provided the noise estimate. This assumption was appropriate, since the reference did not show any noticeable features within the pupil filter nor in the adjacent quadrant, and the chat-like feature was approximately radially symmetric [cf. Fig. 1 (b)].
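A minimal sketch of how such an estimate could be computed from a stack of frames is given below. It assumes that the estimator is (⟨E H ⟩ P − ⟨E N ⟩ P )/⟨E N ⟩ P and that E N is read from the region that mirrors the pupil filter across the f y axis. The synthetic frames, the 64 × 64 grid, the unit-variance Gaussian noise, and all function names are illustrative assumptions rather than the processing used for the experimental data.

```python
import numpy as np


def fourier_energy(frames):
    """Mean Fourier-plane energy over a stack of frames (unitary 2-D DFT, DC centered)."""
    spectra = np.fft.fftshift(np.fft.fft2(frames, norm="ortho"), axes=(-2, -1))
    return np.mean(np.abs(spectra) ** 2, axis=0)


def mirror_fx(mask):
    """Reflect a DC-centered mask about the f_y axis (f_x -> -f_x) on an even-sized grid."""
    return np.roll(np.flip(mask, axis=1), 1, axis=1)


def estimate_snr(frames, pupil_mask):
    """Assumed estimator: (<E_H>_P - <E_N>_P) / <E_N>_P, with E_N from the mirrored region."""
    energy = fourier_energy(frames)
    e_hologram = energy[pupil_mask].mean()
    e_noise = energy[mirror_fx(pupil_mask)].mean()
    return (e_hologram - e_noise) / e_noise


def make_pupil_mask(n, q_i):
    """Circular pupil of diameter n / q_i centered in the (+f_x, +f_y) quadrant."""
    f = np.arange(n) - n // 2
    fx, fy = np.meshgrid(f, f)
    return (fx - n // 4) ** 2 + (fy - n // 4) ** 2 <= (n / (2.0 * q_i)) ** 2


rng = np.random.default_rng(0)
n, n_frames = 64, 20
pupil = make_pupil_mask(n, 2.7)

# Synthetic fringe term: a real image whose spectrum fills the pupil and its conjugate twin.
n_bins = int(pupil.sum())
spectrum = np.zeros((n, n), dtype=complex)
spectrum[pupil] = np.sqrt(20.0) * (rng.standard_normal(n_bins) + 1j * rng.standard_normal(n_bins))
fringes = np.real(np.fft.ifft2(np.fft.ifftshift(spectrum), norm="ortho"))

noise = rng.standard_normal((n_frames, n, n))  # unit-variance stand-in for shot noise
snr_noise_only = estimate_snr(noise, pupil)
snr_with_signal = estimate_snr(noise + fringes, pupil)
print(f"noise only: {snr_noise_only:.2f}, with fringes: {snr_with_signal:.2f}")
```

In this toy setup, the fringe energy per pupil sample is about ten times the noise energy, so the estimate should land near 10 with fringes and near 0 without them.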

Efficiency calculations
From Eqs. (3) and (4), we calculated the estimated total-system efficiency, η ′ t , as
$$ \eta'_t = \frac{S/N'}{S/N_i}. \tag{5} $$
This calculation quantifies how much the estimated SNR, S/N ′ [cf. Equation (4)], is below the ideal, shot-noise-limited SNR, S/N i [cf. Equation (3)]. Therefore, we achieved the shot-noise limit in the ensuing analysis when η ′ t = 100%. Various physical phenomena, in practice, induce efficiency losses that degrade the achievable SNR, which made the shot-noise limit extremely difficult to achieve. For example, we included one such loss, the quantum efficiency of the FPA, in the definition of m S in Eq. (3); thus, one might refer to the shot-noise limit defined in this paper as the quantum limit. To account for other efficiency losses, we used the total-system efficiency η t [cf. Equations (1) and (2)], and deconstructed it into independent multiplicative terms [33][34][35].
For simplicity in the analysis, we deconstructed the total-system efficiency η t into two-major efficiencies, such that η t = η m η n , where η m is the mixing efficiency and η n is the noise efficiency. Note that η m characterizes how well the signal and reference interfere and how well the FPA digitally records the resulting hologram. Also note that η n characterizes how much noise is above the reference shot noise.
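The relationship between the estimated SNR, the ideal SNR, and the gain still needed to reach the shot-noise limit can be sketched as follows, assuming the estimated total-system efficiency is the ratio S/N ′ / S/N i . The 8.1 dB figure comes from the abstract; the baseline efficiency and SNR printed here are back-computed from it for illustration and are not values taken from Table 1.

```python
import math


def total_efficiency(snr_estimated, snr_ideal):
    """Assumed form of the estimated total-system efficiency: S/N' divided by S/N_i."""
    return snr_estimated / snr_ideal


def gain_to_limit_db(eta_t):
    """Gain in dB needed to raise the efficiency eta_t to 100%."""
    return -10.0 * math.log10(eta_t)


def efficiency_from_gain_db(gain_db):
    """Efficiency implied by a required gain in dB."""
    return 10.0 ** (-gain_db / 10.0)


eta_t = efficiency_from_gain_db(8.1)  # 8.1 dB is the overall gain quoted in the abstract
snr_baseline = eta_t * 661.0          # back-computed, illustrative only
print(f"implied eta_t' = {100 * eta_t:.1f}% -> S/N' of about {snr_baseline:.0f}")
```

Because η t = η m η n , the same decibel bookkeeping applies to each factor separately, and the individual gains add.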
Various physical phenomena, in practice, contribute to the mixing efficiency η m , such as the signal-reference polarization, the pixel modulation transfer function, the laser coherence, and the laboratory vibrations. Previous efforts analyzed η m in terms of independent multiplicative terms [33][34][35], but here, we only accounted for the overall η m . To estimate η m , we made use of the calculation in Eq. (6), where η ′ m is the estimated η m . In Eq. (6), ⟨·⟩ P again denotes a spatial average over the pupil filter in the Fourier plane, whereas ⟨·⟩ I denotes a spatial average over the entire image plane; therefore, the factor of π/(4q 2 I ) is the ratio of the pupil area to the Fourier-plane area.
Both the reference and signal, in practice, yield excess noise that is above the reference shot noise [33]. Thus, we accounted for the total excess noise using the noise efficiency η n . To estimate η n , we made use of the calculation in Eq. (7), where η ′ n is the estimated η n . By definition, Eq. (7) is the ratio of the reference shot-noise variance, which is Poisson distributed, to the total noise. Therefore, when η ′ n < 100%, the hologram contains more noise than the reference shot noise, and when η ′ n ≥ 100%, the hologram contains less noise than the reference shot noise. In this latter regime, we specifically overcome the shot-noise floor [cf. Equation (24)].
With Eqs. (3)-(7) in mind, in Table 1 we provide the initial estimates for the baseline dataset (cf. Figure 1), where the ± denotes the standard deviation. Here, η ′ t ≈ η ′ m η ′ n , which supports the background details presented throughout this section.

Table 1. Initial estimates for the baseline dataset (cf. Figure 1, where t i = 100 µs and ∆ℓ = 0 m). Columns: Calculation, Initial Estimates, Eq.

Frame subtraction
We can describe the mean digital-hologram frame, m H (x, y), as
$$ m_H(x,y) = m_R(x,y) + m_S(x,y) + \beta \, U_R(x,y) \, U_S^*(x,y) + \beta \, U_R^*(x,y) \, U_S(x,y), \tag{8} $$
where m R (x, y) is the mean reference-only frame, m S (x, y) is the mean signal-only frame, β is the irradiance-to-photoelectron conversion factor, U R is the reference field, U S is the signal field, and the superscript asterisks denote complex conjugates. Since m R (x, y) and m S (x, y) contribute to the total excess noise, we can subtract these frames from m H (x, y) to minimize the excess reference and signal noise, respectively. This frame subtraction, in turn, boosts the SNR by increasing the noise efficiency while keeping the mixing efficiency relatively constant (i.e., unchanged). We can quantify this last statement using the subtracted-total gain, γ st , and the subtracted-noise gain, γ sn , respectively, such that
$$ \gamma_{st} = 10 \log_{10}\left(\frac{\eta'_{st}}{\eta'_t}\right) = 10 \log_{10}\left(\frac{S/N'_s}{S/N'}\right) \tag{9} $$
and
$$ \gamma_{sn} = 10 \log_{10}\left(\frac{\eta'_{sn}}{\eta'_n}\right). \tag{10} $$
In Eqs. (9) and (10), η ′ st is the final estimated total-system efficiency after frame subtraction, η ′ t is the initial estimated total-system efficiency [cf. Equation (5) and Table 1], S/N ′ s is the estimated SNR after frame subtraction, S/N ′ is the initial estimated SNR [cf. Equation (4) and Table 1], η ′ sn is the final estimated noise efficiency after frame subtraction, and η ′ n is the initial estimated noise efficiency [cf. Equation (7) and Table 1].
In what follows, we quantify the effects of frame subtraction via the subtracted-total gain, γ st , and the subtracted-noise gain, γ sn . We do so by subtracting the mean reference-only frame, m R (x, y), and the mean signal-only frame, m S (x, y), from the mean digital-hologram frame, m H (x, y), prior to demodulation (i.e., before performing an inverse Fourier transform and filtering the appropriate pupil function in the Fourier plane). First, we calculate γ st and γ sn by subtracting m R (x, y) and m S (x, y) independently from m H (x, y). Then, we calculate γ st and γ sn when we subtract both m R (x, y) and m S (x, y) from m H (x, y). Based on these calculations, we find that γ st ≈ γ sn to the first decimal place. This outcome says that the estimated mixing efficiency stays relatively unchanged with frame subtraction. Thus, we conclude that frame subtraction has minimal effects on the estimated mixing efficiency.
Before moving on in the analysis, it is important to note that this section only presents results for the baseline dataset (cf. Fig. 1, where t i = 100 µs and ∆ℓ = 0 m) because the results for the other digital-holography datasets yielded the same conclusions.
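Because the total-system efficiency is modeled as the product η m η n , gains expressed in decibels add, which is why γ st ≈ γ sn implies an essentially unchanged mixing efficiency. The sketch below illustrates this with hypothetical efficiency values; only the functional form 10 log 10 (final/initial) follows the gain definitions, and the numbers are not taken from Table 1 or Table 2.

```python
import math


def gain_db(final, initial):
    """Gain in dB between a final and an initial efficiency (or SNR)."""
    return 10.0 * math.log10(final / initial)


# Hypothetical efficiencies before and after frame subtraction (illustrative only).
eta_m_before, eta_n_before = 0.37, 0.42
eta_m_after, eta_n_after = 0.37, 0.73

gamma_sn = gain_db(eta_n_after, eta_n_before)
gamma_sm = gain_db(eta_m_after, eta_m_before)
gamma_st = gain_db(eta_m_after * eta_n_after, eta_m_before * eta_n_before)

print(f"noise gain  = {gamma_sn:.2f} dB")
print(f"mixing gain = {gamma_sm:.2f} dB")
print(f"total gain  = {gamma_st:.2f} dB")
```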

Mean reference-only frame subtraction
Recall that the non-uniform reference yields excess noise in the mean digital-hologram frame, m H (x, y) [see Fig. 1 (a) and (c)]. Since it is a straightforward image-post-processing technique, we specifically used frame subtraction to perform non-uniformity correction. Ideally, subtracting the mean reference-only frame, m R (x, y), from m H (x, y) should remove this lack of uniformity in the reference and the associated excess noise, thus boosting the SNR by increasing the noise efficiency while keeping the mixing efficiency relatively constant. In turn, we tried different types of frame subtraction.
With respect to the reference-only frames, the most effective type of frame subtraction that we tried was to subtract the mean reference-only frame from the individual reference-only frames. As such, the mean reference-subtracted, reference-only frame, m (−R) R (x, y), took the form given in Eq. (11), in which the mean reference-only frame is subtracted from an individual reference-only frame, m R (x, y). Note that the mean reference-only frame in Eq. (11) comes from a 20-frame file recorded sequentially with the individual frame, and m (−R) R (x, y) is the mean over 200 frames (i.e., we used ten separate 20-frame files). We performed the frame subtraction this way because using a mean reference-only frame taken over all 200 frames was less effective, as discussed below.
With Eq. (11) in mind, we observed a residual difference on the order of ±100 pe across m (−R) R (x, y), which in comparison to Fig. 1 (a), was much improved. Additionally, we demodulated each m (−R) R (x, y) frame and took the mean of the Fourier-plane energy to provide E (−R) R (f x , f y ), as shown in Fig. 2 (a). In comparison to Fig. 1 (d), the low-spatial-frequency features were less pronounced; for reference, the mean value in the pupil filter for Fig. 1 (d) was 3,483 ± 18 pe 2 . Therefore, these results show that the frame subtraction did remove some of the excess noise caused by the non-uniform reference.
As previously mentioned, when we defined the mean reference-only frame in Eq. (11) as the mean over 200 frames, the frame subtraction was less effective. Even though there were no observable differences in the 200 reference-only frames, there were noticeable differences after frame subtraction, such as m (−R) R (x, y) having appreciable residual differences and E (−R) R (f x , f y ) having residual features. In addition, the mean value of E (−R) R (f x , f y ) in the pupil filter was 2,946 ± 158 pe 2 . These differences suggest that there were some minor temporal changes to the lack of uniformity in the reference. We believe these changes could be due to a drift in the MO laser's center frequency, since we have measured it to drift as much as 240 Hz/s over 30 minutes [35], which is about the amount of time it took to record the baseline dataset (cf. Figure 1, where t i = 100 µs and ∆ℓ = 0 m). Since the lack of uniformity in the reference is mostly due to the etalon-interference pattern caused by the FPA's coverglass, a change in wavelength would cause the resultant fringes to change.
With respect to the digital-hologram frames, the most effective type of frame subtraction that we tried was to subtract the mean reference-only frame from the individual digital-hologram frames. As such, the mean reference-subtracted, digital-hologram frame, m (−R) H (x, y), took the form given in Eq. (12), in which the mean reference-only frame is subtracted from an individual digital-hologram frame, m H (x, y). Note that the mean reference-only frame in Eq. (12) comes from the 20-frame file recorded sequentially after the corresponding digital-hologram frame, and m (−R) H (x, y) is the mean over 200 frames (i.e., we used ten separate 20-frame files).
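The difference between subtracting a per-file mean and a single 200-frame mean can be illustrated with a toy simulation. The sketch below assumes a one-dimensional fringe pattern whose phase drifts slowly from one 20-frame file to the next; the drift of 0.5 rad, the fringe depth of 300 pe, and the fringe period are hypothetical values, and Poisson sampling stands in for shot noise.

```python
import numpy as np

rng = np.random.default_rng(1)
n_files, n_per_file, n_pix = 10, 20, 256
x = np.arange(n_pix)

# Hypothetical reference: a fringe pattern whose phase drifts slowly from file to file.
phase = np.linspace(0.0, 0.5, n_files)[:, None, None]            # radians, assumed drift
pattern = 2676.0 + 300.0 * np.cos(2 * np.pi * x / 32.0 + phase)  # shape (files, 1, pixels)
frames = rng.poisson(pattern, size=(n_files, n_per_file, n_pix)).astype(float)

per_file_mean = frames.mean(axis=1, keepdims=True)     # one mean frame per 20-frame file
global_mean = frames.mean(axis=(0, 1), keepdims=True)  # one mean frame over all 200 frames

rms_per_file = np.sqrt(np.mean((frames - per_file_mean) ** 2))
rms_global = np.sqrt(np.mean((frames - global_mean) ** 2))
print(f"residual RMS, per-file mean:  {rms_per_file:.1f} pe")
print(f"residual RMS, 200-frame mean: {rms_global:.1f} pe")
```

In this toy model, the per-file mean tracks the drifting pattern and leaves a residual close to the shot-noise level, whereas the 200-frame mean leaves an additional residual from the drift.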
With Eq. (12) in mind, we observed more uniformity across m (−R) H (x, y), which in comparison to Fig. 1 (c), means that we removed some of the excess noise due to the non-uniform reference. Next, we demodulated each m (−R) H (x, y) frame and took the mean of the Fourier-plane energy to provide E (−R) H (f x , f y ), as shown in Fig. 2 (b). In comparison to Fig. 1 (f), we observed that the low-spatial-frequency features, apparent in E H (f x , f y ), disappeared. We then calculated the subtracted-total gain, γ st [cf. Equation (9)], and the subtracted-noise gain, γ sn [cf. Equation (10)], which resulted in values of 0.3 dB for both, as shown in Table 2. Overall, the performance increase was less than expected. To help quantify this last statement, it is important to note that the non-uniform reference contributed about 30% of the total excess noise. Thus, if we effectively removed all of the excess noise due to the non-uniform reference, then we would have expected S/N ′ s and η ′ sn to increase to 135 and 54.3%, respectively, with γ st ≈ γ sn ≈ 1.1 dB. These values are not what we report in Table 2; nonetheless, the mean reference-only frame subtraction did, in fact, boost the SNR by increasing the noise efficiency while keeping the mixing efficiency relatively constant.

Mean signal-only frame subtraction
Recall that the signal also yields excess noise in the mean digital-hologram frame, m H (x, y). This excess noise is due to the pupil filter partially sampling a chat-like feature from the autocorrelation of the signal in the Fourier plane during demodulation [see Fig. 1 (e) and (f)]. Ideally, subtracting the mean signal-only frame, m S (x, y), from m H (x, y) should remove this chat-like feature and the associated excess noise, thus boosting the SNR by increasing the noise efficiency while keeping the mixing efficiency relatively constant. In turn, we tried different types of frame subtraction.
With respect to the signal-only frames, the most effective type of frame subtraction that we tried was to subtract the mean signal-only frame from the individual signal-only frames. As such, the mean signal-subtracted, signal-only frame, m (−S) S (x, y), took the form given in Eq. (13), in which the mean signal-only frame is subtracted from an individual signal-only frame, m S (x, y). Note that the mean signal-only frame in Eq. (13) comes from a 20-frame file recorded sequentially with the individual frame for the same speckle realization, and m (−S) S (x, y) is the mean over 200 frames (i.e., we used ten separate speckle realizations).
With Eq. (13) in mind, we observed a residual difference on the order of 1 × 10 −14 pe across m (−S) S (x, y), which in comparison to Fig. 1 (b), was negligible. Additionally, we demodulated each m (−S) S (x, y) frame and took the mean of the Fourier-plane energy to provide E (−S) S (f x , f y ), as shown in Fig. 3 (a). In comparison to Fig. 1 (e), we observed that the chat-like feature, apparent in E S (f x , f y ), mostly disappeared, but a small, doughnut-shaped residual remained on the order of 3 pe 2 . This doughnut-shaped residual was observable in the individual E (−S) S (f x , f y ) frames; thus, we believe that there were some minor temporal changes to the nearly Gaussian beam used for the illuminator in the experimental setup. For comparison, the mean value of E S (f x , f y ) in the pupil filter was 2,611 pe 2 and the mean value of E (−S) S (f x , f y ) in the pupil filter was 106 pe 2 . Therefore, these results show that the signal-only frame subtraction did, in fact, remove the majority of the excess noise caused by the chat-like feature.
With respect to the digital-hologram frames, the most effective type of frame subtraction that we tried was to subtract the mean signal-only frame from the individual digital-hologram frames. As such, the mean signal-subtracted, digital-hologram frame, m (−S) H (x, y), took the form given in Eq. (14), in which the mean signal-only frame is subtracted from an individual digital-hologram frame, m H (x, y). Note that the mean signal-only frame in Eq. (14) comes from a 20-frame file recorded sequentially with m H (x, y) for the same speckle realization, and m (−S) H (x, y) is the mean over 200 frames (i.e., we used ten separate speckle realizations).
With Eq. (14) in mind, we observed more uniformity across m (−S) H (x, y), which in comparison to Fig. 1 (c), means that we removed some of the excess noise due to the signal. Next, we demodulated each m (−S) H (x, y) frame and took the mean of the Fourier-plane energy to provide E (−S) H (f x , f y ), as shown in Fig. 3 (b). In comparison to Fig. 1 (f), we observed that the chat-like feature, apparent in E H (f x , f y ), mostly disappeared. We also observed that the doughnut-shaped residual was negligible (i.e., it was much less than the reference shot noise).
We then calculated the subtracted-total gain, γ st [cf. Equation (9)], and the subtracted-noise gain, γ sn [cf. Equation (10)], which resulted in values of 1.9 dB for both, as shown in Table 2. Overall, the performance increase was less than expected. To help quantify this statement, it is important to note that due to the pupil filter partially sampling the chat-like feature during demodulation, the signal contributed about 70% of the total excess noise. Thus, if we effectively removed all of the excess noise due to the chat-like feature, then we would have expected S/N ′ s and η ′ sn to increase to 192 and 76.8%, respectively, with γ st ≈ γ sn ≈ 2.7 dB. These values are not what we report in Table 2; nonetheless, the mean signal-only frame subtraction did, in fact, boost the SNR by increasing the noise efficiency while keeping the mixing efficiency relatively constant.

Mean reference- and signal-only frame subtraction
To build on the results presented in Figs. 2 and 3, we combined the mean reference-only frame subtraction with the mean signal-only frame subtraction. In turn, the mean reference- and signal-subtracted, digital-hologram frame, m (−RS) H (x, y), took the form given in Eq. (15), in which both the mean reference-only frame and the mean signal-only frame are subtracted from an individual digital-hologram frame, m H (x, y). Here, the mean reference-only frame and the mean signal-only frame each come from a 20-frame file recorded sequentially with m H (x, y) for the same speckle realization, and m (−RS) H (x, y) is the mean over 200 frames (i.e., we used ten separate speckle realizations).
With Eq. (15) in mind, we observed more uniformity across m (−RS) H (x, y), as shown in Fig. 4 (a), which in comparison to Fig. 1 (c), means that we removed some of the total excess noise due to the signal and reference. Next, we demodulated each m (−RS) H (x, y) frame and took the mean of the Fourier-plane energy to provide E (−RS) H (f x , f y ), as shown in Fig. 4 (b). In comparison to Fig. 1 (f), we observed that the low-spatial-frequency features and the chat-like feature, apparent in E H (f x , f y ), mostly disappeared. Again, we calculated the subtracted-total gain, γ st [cf. Equation (9)], and the subtracted-noise gain, γ sn [cf. Equation (10)], which resulted in values of 2.4 dB for both, as shown in Table 2. Overall, the performance increase was better than expected. To help quantify this statement, we expected γ st and γ sn to be the sum of the gains achieved from the mean reference-only frame subtraction and the mean signal-only frame subtraction independently, which would have been 2.2 dB. However, the gain from this combination was 2.4 dB. Even though we did not achieve the shot-noise limit, the mean reference- and signal-only frame subtraction did, in fact, boost the SNR by increasing the noise efficiency while keeping the mixing efficiency relatively constant.

Frame averaging
Frame averaging is a straightforward image-post-processing technique, which when effectively used with multi-shot digital holography data, boosts the SNR by decreasing the noise. If we assume that the collected digital-holography datasets are shot-noise limited, then the SNR directly depends on the signal strength [cf. Equation (3)]. Therefore, the SNR boost due to frame averaging should scale with the number of frames averaged; however, there are practical limitations to this last statement. One such limitation is that the digital-hologram fringes must be stable from frame to frame. With the potential benefits of frame averaging in mind, we wanted to investigate two-independent phenomena that affect the stability of the digital-hologram fringes: (i) laboratory vibrations and (ii) optical-path-length differences between the reference and signal.
(i) Laboratory vibrations cause the digital-hologram fringes to fluctuate across the FPA pixels. When these fringe fluctuations occur during the camera-integration time, t i , the digital-hologram fringes start to wash out and cause an efficiency loss that degrades the achievable SNR. A previous effort quantified the effects of laboratory vibrations for the experimental setup used in this paper [35]. In particular, when t i = 100 ms, laboratory vibrations cause an efficiency loss of 6%, whereas when t i = 100 µs, the effects of laboratory vibrations are negligible.
(ii) Optical-path-length differences between the reference and signal also cause the digital-hologram fringes to fluctuate across the FPA pixels. A previous effort quantified the effects of optical-path-length differences for the experimental setup used in this paper [35]. In practice, if the optical-path-length difference, ∆ℓ, is greater than zero, then the relative phase difference between the reference and signal fluctuates, which causes fringe fluctuations. The degree of the fringe fluctuations, of course, depends on the MO laser's coherence length with respect to ∆ℓ.
In what follows, we examine the effects of (i) and (ii) on the stability of the digital-hologram fringes while performing frame averaging. For this purpose, we analyze four digital-holography datasets with a combination of t i = 100 µs and 100 ms for the camera-integration times and ∆ℓ = 0 m and 248 m (i.e., 247.5 m rounded) for the optical-path-length differences. To quantify the boost in the SNR due to frame averaging, we calculate the appropriate gain as a function of the number of frames averaged. For this purpose, Eq. (16) denotes frame averaging without frame subtraction and Eq. (17) denotes frame averaging with frame subtraction. Here, m H (x, y) is again an individual digital-hologram frame, and m R (x, y) and m S (x, y) are again the mean reference-only frame and the mean signal-only frame, respectively, from a 20-frame file recorded sequentially with m H (x, y) for the same speckle realization. Note that in the following frame-averaging results, we calculated the mean and standard deviation over ten speckle realizations. Also note that frame averaging across different speckle realizations did not produce useful results due to the time lapse and lack of stability in the digital-hologram fringes between the dataset recordings.
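The interplay between fringe stability and frame averaging can be illustrated with a toy phasor model. The sketch below assumes that each frame contributes the same fringe amplitude with a frame-specific phase, that the phase is frozen within each frame, and that the noise is independent with equal variance across frames; it is not the analysis used for the experimental data, and the 1 rad RMS jitter is a hypothetical value.

```python
import cmath
import math
import random


def averaging_gains_db(fringe_phases):
    """Return (mixing, noise, total) gains in dB for averaging frames with the given phases."""
    n = len(fringe_phases)
    mean_phasor = sum(cmath.exp(1j * p) for p in fringe_phases) / n
    mixing = abs(mean_phasor) ** 2  # fringe washout from phase changes between frames
    noise = float(n)                # independent noise variance drops as 1/n
    return tuple(10.0 * math.log10(v) for v in (mixing, noise, mixing * noise))


random.seed(0)
stable = [0.0] * 20
jittered = [random.gauss(0.0, 1.0) for _ in range(20)]  # hypothetical 1 rad RMS jitter

for label, phases in (("stable", stable), ("jittered", jittered)):
    mixing, noise, total = averaging_gains_db(phases)
    print(f"{label:>8}: mixing {mixing:+.1f} dB, noise {noise:+.1f} dB, total {total:+.1f} dB")
```

In this model, the noise gain depends only on the number of frames averaged, whereas any frame-to-frame phase change reduces the mixing gain, so the total gain reaches the full factor of N only when the fringes are stable.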

Mixing and noise gain results
To characterize the effects of frame averaging on the mixing efficiency, we calculated the averaged-mixing gain, γ am , and the averaged-subtracted-mixing gain, γ asm , such that

$$\gamma_{am} = 10 \log_{10}\left(\frac{\eta'_{am}}{\eta'_{m}}\right) \qquad (18)$$

and

$$\gamma_{asm} = 10 \log_{10}\left(\frac{\eta'_{asm}}{\eta'_{m}}\right). \qquad (19)$$

In Eq. (18), η ′ am is the final estimated mixing efficiency after frame averaging, whereas in Eq. (19), η ′ asm is the final estimated mixing efficiency after frame subtraction and averaging. For both Eqs. (18) and (19), η ′ m is the initial estimated mixing efficiency [cf. Equation (6) and Table 1]. Similarly, to characterize the effects of frame averaging on the estimated noise efficiency, we calculated the averaged-noise gain, γ an , and the averaged-subtracted-noise gain, γ asn , such that

$$\gamma_{an} = 10 \log_{10}\left(\frac{\eta'_{an}}{\eta'_{n}}\right) \qquad (20)$$

and

$$\gamma_{asn} = 10 \log_{10}\left(\frac{\eta'_{asn}}{\eta'_{n}}\right). \qquad (21)$$

In Eq. (20), η ′ an is the final estimated noise efficiency after frame averaging, whereas in Eq. (21), η ′ asn is the final estimated noise efficiency after frame subtraction and averaging. For both Eqs. (20) and (21), η ′ n is the initial estimated noise efficiency [cf. Equation (7) and Table 1]. With Eqs. (18)-(21) in mind, Fig. 5 shows frame-averaging results for these mixing and noise gain calculations.
Referencing Fig. 5, the frame averaging greatly improved the noise efficiency, especially when we first included the benefits of frame subtraction. On average, the gain was 6.1 dB or 202% ± 60% with frame subtraction. However, frame averaging was detrimental to the mixing efficiency, especially when the digital-hologram fringes were less stable.

Fig. 5. Frame-averaging results showing the averaged-mixing gain, γ am , the averaged-subtracted-mixing gain, γ asm , the averaged-noise gain, γ an , and the averaged-subtracted-noise gain, γ asn , all as a function of the number of frames averaged [cf. Equations (18)-(21), respectively]. The data points display the mean over 10 speckle realizations, whereas the error bars display the standard deviation. For the mixing-gain calculations, there is no observable difference between the case with frame subtraction and the case with no frame subtraction; thus, we plot γ asm and γ am as a single line. Here, we display results for four digital-holography datasets with a combination of camera-integration times (t i ) and optical-path-length differences (∆ℓ), such that in (a) t i = 100 µs and ∆ℓ = 0 m, in (b) t i = 100 ms and ∆ℓ = 0 m, in (c) t i = 100 µs and ∆ℓ = 248 m, and in (d) t i = 100 ms and ∆ℓ = 248 m.
To make sense of this last point, we needed to look at the details associated with all four digital-holography datasets. For example, when t i = 100 µs and ∆ℓ = 0 m [cf. Figure 5 (a)], the digital-hologram fringes were the most stable, since the estimated mixing efficiencies only decreased from 38% to 36%. On the other hand, when t i = 100 ms and ∆ℓ = 0 m [cf. Figure 5 (b)], we incurred laboratory vibrations with the longer camera-integration time (t i ), and the estimated mixing efficiencies decreased further, from 32% to 23%. Furthermore, when t i = 100 µs and ∆ℓ = 248 m [cf. Figure 5 (c)], we induced a long optical-path-length difference (∆ℓ), and the estimated mixing efficiencies decreased even more, from 33% to 17%. Finally, when t i = 100 ms and ∆ℓ = 248 m [cf. Figure 5 (d)], we incurred laboratory vibrations and induced optical-path-length differences, such that the frame averaging was the most deleterious, decreasing the mixing efficiency from 15% to 1.6%, where it hovered after just a few frames were averaged. These outcomes signify that the digital-hologram fringes were increasingly less stable across the four digital-holography datasets.
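To put the mixing-efficiency values quoted above on the same decibel scale as Fig. 5, the short sketch below converts each initial and final efficiency pair into a gain, taken as ten times the base-10 logarithm of the final-to-initial ratio. The helper name gain_db and the dictionary layout are assumptions for illustration; the efficiency values are the ones quoted in the preceding paragraph.

```python
import math

def gain_db(final, initial):
    """Gain in dB between a final and an initial efficiency (or SNR)."""
    return 10.0 * math.log10(final / initial)

# (initial, final) estimated mixing efficiencies quoted for each dataset.
mixing_efficiencies = {
    "(a) t_i = 100 us, dl = 0 m":   (0.38, 0.36),
    "(b) t_i = 100 ms, dl = 0 m":   (0.32, 0.23),
    "(c) t_i = 100 us, dl = 248 m": (0.33, 0.17),
    "(d) t_i = 100 ms, dl = 248 m": (0.15, 0.016),
}

mixing_gains = {
    label: gain_db(final, initial)
    for label, (initial, final) in mixing_efficiencies.items()
}

for label, gain in mixing_gains.items():
    print(f"{label}: {gain:+.1f} dB")
```

With these rounded efficiencies, the mixing gains come out to roughly −0.2 dB, −1.4 dB, −2.9 dB, and −9.7 dB for datasets (a) through (d), which is consistent with the fringes becoming progressively less stable.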
Before moving on in the analysis, first note that frame subtraction provides no observable impact on the mixing efficiency, which is the reason why we only show one line for mixing-gain calculations in Fig. 5. Also note that we include a 20-frame summary of the frame-averaging results presented in this subsection, specifically without frame subtraction, in Table 3, and specifically with frame subtraction, in Table 4.

Total gain results
To characterize the effects of frame averaging on the total-system efficiency, we calculated the averaged-total gain, γ at , and the averaged-subtracted-total gain, γ ast , such that

$$\gamma_{at} = 10 \log_{10}\left(\frac{\eta'_{at}}{\eta'_{t}}\right) = 10 \log_{10}\left(\frac{S/N'_{a}}{S/N'}\right) \qquad (22)$$

and

$$\gamma_{ast} = 10 \log_{10}\left(\frac{\eta'_{ast}}{\eta'_{t}}\right) = 10 \log_{10}\left(\frac{S/N'_{as}}{S/N'}\right). \qquad (23)$$

In Eq. (22), η ′ at is the final estimated total-system efficiency after frame averaging and S/N ′ a is the final estimated SNR after frame averaging, whereas in Eq. (23), η ′ ast is the final estimated total-system efficiency after frame subtraction and averaging, and S/N ′ as is the final estimated SNR after frame subtraction and averaging. For both Eqs. (22) and (23), η ′ t is the initial estimated total-system efficiency [cf. Equation (5) and Table 1], and S/N ′ is the initial estimated SNR [cf. Equation (4) and Table 1]. Additionally, we calculated the gain needed to surpass the shot-noise floor, γ SNF , and thereafter achieve the shot-noise limit, γ SNL , such that

$$\gamma_{SNF} = 10 \log_{10}\left(\frac{1}{\eta'_{n}}\right) \qquad (24)$$

and

$$\gamma_{SNL} = 10 \log_{10}\left(\frac{1}{\eta'_{t}}\right), \qquad (25)$$

where η ′ n is the initial estimated noise efficiency [cf. Eq. (7) and Table 1] and η ′ t is the initial estimated total-system efficiency [cf. Eq. (5) and Table 1]. Recall that we defined the shot-noise limit as the gain needed to boost the SNR, such that it equals the ideal, shot-noise-limited SNR, S/N i , given in Eq. (3). With Eqs. (22) and (23) in mind, Fig. 6 shows frame-averaging results for these total-gain calculations relative to the gain needed to surpass the shot-noise floor and thereafter achieve the shot-noise limit [cf. Equations (24) and (25)].
Referencing Fig. 6, we clearly achieved the shot-noise limit when the fringes were the most stable [i.e., when t i = 100 µs and ∆ℓ = 0 m; cf. Figure 6 (a)] and we first included the benefits of frame subtraction, in addition to the frame averaging. We were still able to boost the SNR when laboratory vibrations and optical-path-length differences were independently present [cf. Figure 6 (b) and (c), respectively]. When both of the aforementioned effects were present, however, the frame subtraction and averaging was deleterious after averaging just two frames because the digital-hologram fringes were unstable from frame to frame [cf. Figure 6 (d)]. This outcome was due to the fact that the frame averaging decreased the mixing efficiency more than the frame averaging increased the noise efficiency.

Fig. 6. Frame-averaging results showing the averaged-total gain, γ at , and the averaged-subtracted-total gain, γ ast , relative to the gain needed to surpass the shot-noise floor, γ SNF , and thereafter achieve the shot-noise limit, γ SNL , all as a function of the number of frames averaged [cf. Equations (22)-(25), respectively]. The data points display the mean over 10 speckle realizations, whereas the error bars display the standard deviation. Here, we display results for four digital-holography datasets with a combination of camera-integration times (t i ) and optical-path-length differences (∆ℓ), such that in (a) t i = 100 µs and ∆ℓ = 0 m, in (b) t i = 100 ms and ∆ℓ = 0 m, in (c) t i = 100 µs and ∆ℓ = 248 m, and in (d) t i = 100 ms and ∆ℓ = 248 m.
The frame-averaging results presented in Fig. 6 clearly show that frame averaging boosts the SNR by itself, but frame subtraction is a necessary first step to achieve the shot-noise limit, in addition to surpassing the shot-noise floor. In practice, frame subtraction improved the frame averaging by at least 3.4 dB across all four digital-holography datasets. To gain further insight into why frame subtraction improved the frame averaging, we show Fourier-plane results in Appendix A and simulation results with an ideal mixing efficiency in Appendix B.
In Table 3 and Table 4, we include a 20-frame summary of the frame-averaging results presented in this subsection. It is important to note that in Table 3, γ am + γ an = γ at and in Table 4, γ st + γ asm + γ asn = γ ast (within the mean rounding error and standard deviations over 10 speckle realizations). These outcomes demonstrate the completeness of the detailed analysis presented herein.
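The additivity noted for Table 3 follows from the logarithm once the total-system efficiency is written as a product of independent terms. The short derivation below is a sketch under one stated assumption: the symbol η'_o (hypothetical, not used elsewhere in this paper) lumps together every efficiency term other than the mixing and noise efficiencies, and those terms are taken to be unchanged by frame averaging.

```latex
% Assumption: \eta'_{o} collects all efficiency terms unaffected by frame averaging.
\begin{align}
\eta'_{t}  &= \eta'_{m}\,\eta'_{n}\,\eta'_{o}, \qquad
\eta'_{at}  = \eta'_{am}\,\eta'_{an}\,\eta'_{o}, \\
\gamma_{at} &= 10\log_{10}\!\left(\frac{\eta'_{at}}{\eta'_{t}}\right)
             = 10\log_{10}\!\left(\frac{\eta'_{am}}{\eta'_{m}}\right)
             + 10\log_{10}\!\left(\frac{\eta'_{an}}{\eta'_{n}}\right)
             = \gamma_{am} + \gamma_{an}.
\end{align}
```

Under this assumption, the unchanged terms cancel in the ratio, so any residual mismatch in the tables would come from rounding and from taking means over the speckle realizations.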

Conclusion
In this paper, we achieved the shot-noise limit using straightforward post-processing techniques with experimental multi-shot digital holography data (i.e., off-axis data composed of multiple noise and speckle realizations). First, we quantified the effects of frame subtraction (of the mean reference-only frame and the mean signal-only frame from the digital-hologram frames), which boosted the signal-to-noise ratio (SNR) of the baseline dataset with a gain of 2.4 dB. Next, we quantified the effects of frame averaging, both with and without the frame subtraction. We then showed that even though the frame averaging boosted the SNR by itself, the frame subtraction was a necessary first step in order to achieve the shot-noise limit. This outcome was due to the autocorrelation of the signal in the Fourier plane, which resulted from collecting the multi-shot digital holography data in an off-axis recording geometry. We also showed that the effectiveness of the frame averaging depends on the stability of the digital-hologram fringes. Overall, we boosted the SNR of the baseline dataset with a gain of 8.1 dB, which was the gain needed to achieve the shot-noise limit.

Appendix A.
To gain insight into why frame subtraction improved the frame averaging, this appendix illustrates the noise reduction in the Fourier plane from frame averaging in the presence of a strong, chat-like feature. Recall that this chat-like feature manifests in the Fourier plane due to the autocorrelation of the signal. With a strong, chat-like feature present, we compared the mean Fourier-plane energy in Fig. 7 (a) to the mean Fourier-plane energy with frame averaging in Fig. 7 (b). We observed that the frame averaging does decrease the overall Fourier-plane background noise, but the chat-like feature remains as strong. However, when we first included the benefits of frame subtraction, in addition to the frame averaging, as shown in Fig. 7 (c), we observed that the chat-like feature was much weaker than in Fig. 7 (a) and (b). It is important to note that in Fig. 3 (b) and Fig. 4 (b), the chat-like feature was not observable with frame subtraction, but the chat-like feature does strengthen with frame averaging, as seen in Fig. 7 (c). This outcome illustrates why frame subtraction improved the frame averaging, specifically in the presence of a strong, chat-like feature.

Appendix B.
This appendix simulates multi-shot digital holography data in the off-axis IPRG to further illustrate the benefits of the frame subtraction and averaging performed in this paper. As shown in Fig. 8, we used uniform illumination of a 1951 USAF bar chart as the object. For simplicity, we assumed far-field propagation; otherwise, all modeled parameters follow those provided in Sec. 2.1. To simulate a non-uniform reference, we used the mean reference frame from the baseline dataset [cf. Figure 1(a)] and normalized it to 2,500 pe, which was the approximate reference strength in the experiment. We also normalized the mean signal frame to correspond to an ideal, shot-noise-limited SNR, S/N i , of 2 [cf. Equation (3)]. In turn, we also simulated the effects of shot noise, read noise, and 12-bit digitization noise.
Note that we did not include the effects of speckle in these simulations. In practice, speckle causes a lot of spatial variation in plots of the 2D wrapped phase, which makes it difficult to discern whether or not the frame subtraction and averaging is adding bias or artifacts to the digital-holography datasets. This last point is the reason that we did not include plots of the 2D wrapped phase throughout the main body of this paper, but do so in this appendix.
Altogether, we simulated 20 realizations of noise and performed frame averaging both with and without frame subtraction. In accordance with the off-axis IPRG, we then demodulated the data by taking the inverse Fourier transform, filtering the pupil in the Fourier plane, and Fourier transforming back to the image plane.
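The three demodulation steps named above (inverse Fourier transform, pupil filtering, forward Fourier transform) can be sketched with NumPy as follows. This is an illustrative sketch under assumptions, not the code used for Fig. 8: the function name, the circular window, the recentering of the filtered pupil, and the synthetic test hologram are all hypothetical choices.

```python
import numpy as np

def demodulate_off_axis(hologram, center, radius):
    """Estimate the complex image-plane field from an off-axis hologram.

    center : (row, col) of the desired pupil term in the fftshift-ed
             Fourier plane (assumed known from the recording geometry).
    radius : radius in pixels of the circular window around that term.
    """
    ny, nx = hologram.shape
    # Step 1: inverse Fourier transform to the Fourier (pupil) plane.
    fourier_plane = np.fft.fftshift(np.fft.ifft2(hologram))
    # Step 2: keep only a circular window around the off-axis pupil term.
    yy, xx = np.indices((ny, nx))
    mask = (yy - center[0]) ** 2 + (xx - center[1]) ** 2 <= radius ** 2
    filtered = np.where(mask, fourier_plane, 0.0)
    # Recenter the windowed term to remove the off-axis carrier.
    filtered = np.roll(filtered, (ny // 2 - center[0], nx // 2 - center[1]),
                       axis=(0, 1))
    # Step 3: Fourier transform back to the image plane.
    return np.fft.fft2(np.fft.ifftshift(filtered))

# Noise-free synthetic check (all values hypothetical).
ny = nx = 64
ky = kx = 16                                   # carrier, cycles per frame
yy, xx = np.indices((ny, nx))
phase = 0.5 * np.cos(2.0 * np.pi * xx / nx)    # smooth test phase
signal = np.exp(1j * phase)
reference = 50.0 * np.exp(-2j * np.pi * (kx * xx / nx + ky * yy / ny))
hologram = np.abs(signal + reference) ** 2

field = demodulate_off_axis(hologram,
                            center=(ny // 2 - ky, nx // 2 - kx),
                            radius=8)
wrapped_phase = np.angle(field)   # recovers the test phase in this toy case
```

In this noise-free toy case, the recovered wrapped phase matches the test phase and the recovered amplitude equals the product of the signal and reference amplitudes. With real data, the window position and size would instead follow from the recording geometry.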
In the first column of Fig. 8, the estimated SNR, S/N ′ , is 0.4 [cf. Equation (4)], whereas in the second and third columns the estimated SNRs, S/N ′ a and S/N ′ as , are 1.9 and 6.3, respectively. With that said, the demodulated data in the first column corresponds to the case with no frame subtraction or averaging. The second and third columns then correspond to the cases with frame averaging only and frame averaging with frame subtraction (hence the subscript a and as, respectively).
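For reference, the simulated SNRs quoted above can be expressed as gains in decibels. The worked arithmetic below uses only the values stated in this appendix (0.4, 1.9, 6.3, and the ideal, shot-noise-limited SNR of 2) and rounds to one decimal place.

```latex
\begin{align}
10\log_{10}\!\left(\frac{S/N'_{a}}{S/N'}\right)  &= 10\log_{10}\!\left(\frac{1.9}{0.4}\right) \approx 6.8~\text{dB}, \\
10\log_{10}\!\left(\frac{S/N'_{as}}{S/N'}\right) &= 10\log_{10}\!\left(\frac{6.3}{0.4}\right) \approx 12.0~\text{dB}, \\
10\log_{10}\!\left(\frac{S/N_{i}}{S/N'}\right)   &= 10\log_{10}\!\left(\frac{2}{0.4}\right) \approx 7.0~\text{dB}.
\end{align}
```

On these numbers, frame averaging alone falls just short of the roughly 7.0 dB needed to reach the ideal SNR, whereas frame averaging with frame subtraction exceeds it by about 5 dB.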
Also note that these simulations model an ideal mixing efficiency (i.e., η m = 100%). Thus, the frame subtraction, in addition to the frame averaging, is most effective because there are essentially no frame-to-frame discrepancies. In this case, frame subtraction almost perfectly subtracts out the excess noise. This last point is why the estimated SNR exceeds the ideal, shot-noise-limited SNR (i.e., S/N ′ as > S/N i ), while the experimental data (presented throughout the body of this paper) only achieved the ideal, shot-noise-limited SNR (i.e., S/N ′ as ≈ S/N i ). Overall, the simulation results presented in Fig. 8 further illustrate the benefits of the frame subtraction and averaging performed in this paper. They also show that frame subtraction and averaging does not add any bias or artifacts to the digital-holography datasets (if performed correctly). This final point further emphasizes the novelty of the detailed analysis presented herein.