Signal averaging improves signal-to-noise in OCT images: But which approach works best, and when?

The high acquisition speed of state-of-the-art optical coherence tomography (OCT) enables massive signal-to-noise ratio (SNR) improvements by signal averaging. Here, we investigate the performance of two commonly used approaches for OCT signal averaging. We present the theoretical SNR performance of (a) computing the average of OCT magnitude data and (b) averaging the complex phasors, and substantiate our findings with simulations and experimentally acquired OCT data. We show that the achieved SNR performance strongly depends on both the SNR of the input signals and the number of averaged signals when the signal bias caused by the noise floor is not accounted for. Therefore we also explore the SNR for the two averaging approaches after correcting for the noise bias and, provided that the phases of the phasors are accurately aligned prior to averaging, then find that complex phasor averaging always leads to higher SNR than magnitude averaging.


Introduction
Optical coherence tomography (OCT) [1] performs biomedical imaging at high speed and with high sensitivity. State-of-the art OCT systems provide data rates of ∼100 million pixels per second and frame rates of ∼100-200 frames per second and beyond [2,3]. OCT provides a distinctively high sensitivity of typically ∼100 dB, meaning that backscatter signals as weak as 10 −10 of a mirror reflection can still be detected [4][5][6]. At the same time, OCT enables imaging with a very high dynamic range spanning several tens of decibels between the strongest and the weakest signal in the image.
The strong performance of OCT in detecting weak signals can be improved even more by image processing. Noise in OCT images limits the detection capabilities for weakly scattering structures, and thus a variety of different approaches for reducing OCT image noise have been proposed [7][8][9][10][11][12][13][14][15]. The high acquisition speeds of modern OCT systems enable the improvement of the signal-to-noise ratio by averaging multiple signals, for instance by fusing OCT frames or even volumes quickly repeated at the same sample position. In recent years, several methods were proposed for improving the detection sensitivity of OCT by averaging the complex-valued OCT signals rather than just averaging their magnitudes [16][17][18][19][20]. In 2013, Szkulmowski and Wojtkowski published a thorough analysis of signals and noise subject to different averaging approaches [21]. In their analysis, the authors found a much stronger reduction of the noise floor by complex averaging as compared to magnitude averaging but also observed a heterogeneous outcome in terms of signal-to-noise performance for different imaging scenarios.
In this article, we set out to answer the question: Which signal averaging approach works best for improving signal-to-noise in OCT images, and when? We first introduce signals and noise in OCT (section 2.1) based on the analysis in Ref. [21]. We then analyze the signals and noise after magnitude and complex averaging, respectively, for a given pixel in an OCT image (section 2.2) and present simple expressions for the resulting signal-to-noise ratios. Next, we compare the somewhat surprising theoretical performance of the averaging schemes for different input signal levels and for different numbers of averaged signals and introduce a noise bias corrected signal-to-noise ratio (SNR) analysis (section 2.3). After substantiating our theoretical analysis with data from simulations (section 3.1) and experimentally acquired OCT data (section 3.2), we conclude the paper with a brief summary of the findings and their implications on actual OCT image processing (section 4). An overview of the terminology as well as a brief section on phase correction required for complex phasor averaging is provided in the appendix.

OCT signals and noise
OCT signals S OCT can be described by complex phasors of the form where A(x, t) denotes the amplitude and φ(x, t) the phase of the OCT signal at location x and time t. Here the amplitude A(x, t) represents the length of the phasor and the phase φ(x, t) corresponds to the polar angle in the complex plane. The noise S noise in OCT images can be specified by phasors too, namely by random phasors. In the complex plane, the real and imaginary parts of random phasors, r noise and i noise , can be described by normal distributions with zero mean and σ 2 variance [22]. Hence, the complex noise signal is characterized by a binormal (or Beckmann) distibution p ri in the complex plane [22,23]: This 2D Gaussian probability density function (PDF) describes the probability for a noise phasor of an image pixel to have the real part r noise and the imaginary part i noise . The probability density function p A of the noise amplitude A noise is represented by the Rayleigh distribution (which can be derived by transforming Eq. (2) to polar coordinates and integrating over all angles φ) [22] p A (A noise ) = A noise Analogous to p ri , the PDF of the noise amplitude represents the probability of the noise amplitude to take specific values A noise . Figure 1(a-d) shows an example of the complex representation of noise phasors in an OCT image and the corresponding Beckmann and Rayleigh distributions. Note that the Beckmann distribution is symmetric and centered at the origin while the Rayleigh distribution is skewed and only takes positive values. The mean amplitude u and variance σ 2 U of a general PDF p U (u) is given by [22]. The mean amplitude A noise and the variance σ 2 noise of p A can be computed as In every OCT image pixel, the detected signal is a combination of the actual OCT signal S OCT and the noise S noise . By choosing the OCT signal to have φ = 0 and thus to point into the direction of the positive real axis ( Fig. 1(e)) similar to Goodman [22], the PDF of the measured amplitudes incorporating both signal and noise is represented by a Rice distribution where B 0 is a modified Bessel function of the first kind and zero order. An example of the PDF of the signal amplitudes in presence of noise seen in an OCT image is shown in Fig. 1(g). Note that the signal gives rise to a somewhat spread probability accumulation at a high amplitude level. The mean amplitude A and variance σ 2 A in an image pixel are observed as where L 1/2 (·) denotes the Laguerre polynomial of degree 1/2. L 1/2 (−x) yields steadily increasing, positive values for increasingly negative arguments −x.
OCT amplitude data are usually squared in order to display a quantity proportional to sample reflectivity, I = A 2 , henceforth called intensity. Therefore it makes sense to investigate the behavior of the PDFs of the squared signal and noise amplitude data. For this purpose, a variable I is performed such that the PDFs are rendered into p I (I) = p A (A = √ I)| dA dI |. The resulting PDFs for noise (p I (I noise )) and signal intensities (p I (I, I noise )) are where I noise = A 2 noise . Note that p I (I noise ) is a negative exponential probability function whereas p I (I, I noise ) is a combination of a negative exponential and a monotonically increasing Bessel function. Examples of the above PDFs for noise and signal intensities are shown in Figs. 1(d) and (h), respectively. The mean values and variances for noise and signal intensities, respectively, are given by By comparing Eqs. (11) and (12), one can observe that the mean noise intensity equals the standard deviation σ noise . Therefore, for the case of both non-zero signal and noise, the observed mean intensity corresponds to I = I + I noise and the intensity variance to σ 2 I = I noise (2I + I noise ). Next, we will investigate the effect of averaging the magnitudes of the phasors as well as the effect of averaging the complex phasors on OCT intensity data.

Averaging magnitudes and phasors
For averaging OCT data, usually the absolute values (magnitudes) of the signals are used. For some applications, the complex signals have been exploited for adding or increasing image contrast in one way or another [31][32][33][34][35][36]. Here, we systematically analyze the effect of averaging magnitude and complex signals in OCT images.

Magnitude averaging
Magnitude averaging uses the absolute values of N spatially and/or temporally separated signals (e.g., N pixels in a 2D or 3D kernel or N repeated measurements at the same spatial pixel location but at different time points): The PDF representing the average of N data points can be computed numerically by convolving the PDF of a single data point (N − 1)-times [21]. Unfortunately, no closed form expressions are available for the distribution describing an N-fold average of the above PDFs and only some approximations and bounds have been derived [37]. Nevertheless, the mean intensity values and intensity variances can be calculated for Rayleigh distributed noise and Rician signals affected by random noise, respectively, as The above expressions show that the mean noise level as well as the observed mean signal level remain unchanged upon averaging and retain the mean intensity values calculated for single data points. However, the variances of the observed signal and noise, σ 2 I and σ 2 I noise , are reduced by 1/N and thus reducing the standard deviations σ I and σ I noise by a factor 1/ √ N.

Complex averaging
In contrast to magnitude averaging, complex averaging also includes the phase information into the averaging process. This inclusion exploits the fact that the noise phasors (in absence of a signal) randomly fluctuate about the origin according to a 2D Gaussian PDF with a standard deviation of σ, see Eq. (2). By averaging N spatially and/or temporally independent complex noise phasors, the standard deviation σ of the Beckmann distribution in Eq. (2) is reduced by a factor 1/ √ N [22]. This reduction of the noise amplitude variance translates into a reduction of both the mean noise intensity as well as of its variance: Compared to the noise performance after magnitude averaging (Eqs. (16) and (17)), which maintains the mean noise level and reduces the noise intensity variance by 1/N, complex averaging leads to a 1/N decrease of the noise floor intensity and to a 1/N 2 smaller noise intensity variance.
Recalling that we initially assumed that the signal vector S OCT was pointing in the direction of the positive real axis (i.e. Fig. 1(e)), the precondition for averaging signals in a complex fashion is that the phase of S OCT remains constant at φ = 0. (Likewise, a different signal phase φ = φ 0 could be chosen, as long as the phases of all signals to be averaged are aligned at the same angle φ 0 . The signal phase (φ = 0) is chosen here out of convenience to ensure a purely real signal which facilitates the calculations described above.) For such perfectly aligned signals in the presence of random noise, the following mean signal intensity and intensity variance will be observed: Unlike magnitude averaging, which maintains the original mean signal intensity I, the mean signal intensity does not stay constant but is reduced by 2σ 2 (N − 1)/N after complex averaging. This signal reduction converges to an amount of 2σ 2 for large N. At the same time, also the noise intensity variance is reduced by an amount σ 2 (N − 1)/N 2 compared to magnitude averaging, which converges to a reduction by σ 2 /N for large N.

Signal-to-noise performance
The signal-to-noise ratio is the measure of choice to describe the influence of noise on signal measurements. An overview of methods for determining the SNR and in particular the sensitivity of an OCT system has recently been published by Agrawal et al. [24]. The SNR in OCT is typically defined as the ratio of the average signal intensity to the standard deviation of the noise intensity, SNR = I/σ I noise [24][25][26][27][28] (or alternatively I / σ I noise for averaged signals). As shown in the previous sections and visualized in Fig. 2, the measured signal intensity I is the sum of the pure signal intensity I and the noise floor I noise . Historically, the bias caused by the noise floor is not specifically factored out from the SNR analysis, however some investigations of OCT image data did exclude the background [29,30]. Because of the multiple approaches for OCT data processing, it makes sense to ask, "how important is noise bias correction when measuring SNR?". In the following section, we will investigate the SNR for the total measured signal (i.e. including the contribution of the noise floor) with and without averaging. Subsequently, we will account for the signal offset caused by the noise floor and evaluate the SNR with noise bias correction. OCT intensity signal I consists of the pure signal intensity I biased by the average noise level I noise .

Signal-to-noise performance without noise bias correction
Using the signal and noise pairs in Eqs. (13) and (12), (18) and (17), and (22) and (21), and the relation I noise = σ I noise = 2σ 2 in Eqs. (11) and (12), we find the following SNRs for single signals, N magnitude averaged signals, and N complex averaged signals, respectively: In short, averaging the magnitudes of N signals improves the SNR by a factor √ N whereas complex averaging of N signals changes the SNR by a factor N less (N − 1).
The relative SNR improvements SNR MAG,CPX /SNR 1 and SNR CPX /SNR MAG are plotted for N up to 100 and for typical OCT image dynamics with SNR 1 between 5 dB and 20 dB in Fig. 3.
In particular for small N, complex averaging provides a considerable SNR improvement over magnitude averaging. At the same time, the SNR 1 dependence of complex averaging has a considerable impact on its performance: While for strong signals with SNR 1 1, complex averaging outperforms magnitude averaging by a factor √ N, the signal-to-noise improvement becomes much less for weaker input signals.

Fig. 3.
Relative SNR improvement by signal averaging for strong input signals (without noise bias correction). (a) The ratios SNR CPX /SNR 1 and SNR MAG /SNR 1 are shown for N from 1 to 100. SNR MAG /SNR 1 is plotted as a dash-dotted line, whereas the SNR 1 -dependent ratio SNR CPX /SNR 1 is plotted in rainbow colors for several SNR 1 values between 5 dB and 50 dB. Note that SNR CPX /SNR 1 converges to an N- N is plotted for the spectrum of SNR 1 values used in (a). Note that in particular for strong input signals with high SNR 1 , complex averaging outperforms magnitude averaging and converges to a √ N-fold better SNR performance. As the SNR profiles converge for large SNRs, the curves in (a) and (b) start to overlap for values greater than 15 dB.
In the next section, we are going to explore the SNR enhancement of the two averaging approaches particularly for very weak signals on the order of the noise background, however still without correcting for the signal bias caused by the noise floor. Will magnitude averaging or complex averaging be superior in terms of recovering such small signals?

Averaging domains and borderline SNR
When the signal bias caused by the noise is not accounted for, the SNR performance averaging depends on the approach taken. Equations (25) and (26) revealed a dependence on the number of averaged signals, N, and -for complex averaging -on the SNR of the input signals, SNR 1 . In this section, we are investigating this dependence for different numbers of averaged signals, N, and for a great range of signals -from much smaller than the noise variance to several orders of magnitude greater. Recalling the identities for the noise variance and the observed signal intensity in Eqs. (12) and (13), we choose the quantity I/2σ 2 , which is identical to SNR 1 − 1, as the benchmark for the input signal. Figure 4(a) shows three plots of the signal-to-noise ratios calculated for magnitude averaging (red) and complex averaging (blue) where relatively small signals I/2σ 2 = 1, 0.5 and 0.1 were chosen as the respective inputs. Note that for I = 2σ 2 (left plot), complex averaging always yields greater SNR than magnitude averaging for N>1. However, for I/2σ 2 <1 and a small number of signals N, magnitude averaging provides a more effective SNR improvement. Figure 4(b) plots three pairs of SNR profiles for a fixed number of averaged signals (N = 2, 10 and 100), this time as a function of the relative signal intensity I/2σ 2 . Again, for weak input signals, the superior performance of magnitude averaging can be observed, while complex averaging excels beyond a crossover point of I/2σ 2 = 1/ √ N. This relative borderline signal is plotted for N up to 100 in Fig. 4(c) alongside the borderline input SNR 1 which is the input SNR 1 for which magnitude averaging and complex averaging perform equally well. An overview of the relative SNR performance SNR CPX /SNR MAG at various inputs I/2σ 2 and N is shown as a heat map in Fig. 5. At first glance, two domains can be differentiated: For stronger signals I/2σ 2 and larger N, complex averaging provides a greater SNR improvement than magnitude averaging (see the area in blue in Fig. 5). For signals comparable to or smaller than the noise level, magnitude averaging yields a better SNR enhancement (see the area in red).
In between these two domains, the borderline with equal SNR performance of the two approaches is indicated in white, SNR CPX /SNR MAG = 1.

Fig. 5.
Relative SNR performance for magnitude and complex averaging (without noise bias correction). The heat map displays the ratio SNR CPX /SNR MAG in decibels for relative input signals I/2σ 2 ranging from -20 dB to +40 dB and up to a number of averaged signals N = 100. For small input signals below the noise level, magnitude averaging yields a better SNR improvement (red range), while complex averaging performs better for greater N and stronger input signals (blue range). The borderline SNR where SNR MAG = SNR CPX separates these two domains (white plot).

SNR in absence of a signal (noise floor only)
An interesting scenario is posed by the calculation of the SNR when the signal intensity I is zero. Then, using the relation I noise = σ I noise = 2σ 2 from Eqs. (11) and (12), the SNRs for a single signal, after magnitude averaging and after complex averaging, respectively, become: An obvious disunity of the SNRs calculated for pure noise signal can be observed. This calls for a revised definition of the SNR, this time accounting for the signal bias I noise imposed by the noise floor.

SNR analysis with noise bias correction
In the limit of a very weak OCT signal, the small meaningful signal contribution sits on top of a comparatively large noise floor. This noise bias is evident as the term related to I noise in Eqs. (24)(25)(26) and is also visualized in Fig. 2. Thus, it makes sense to actually consider the impact of this noise bias in the SNR analysis -in particular for weak signals. In this section, we introduce and evaluate an SNR analysis with noise bias correction. A similar approach has for instance also been used by Makita et al. [39] and recently been modified by our group [40] to correct for noise contributions in PS-OCT images and improve the computation of the degree of polarization uniformity for weak signals. To some extent, this SNR with noise bias correction resembles the SNR definitions in Refs. [29,30] and has formal similarities with previous definitions of the contrast-to-noise ratios in Refs. [21,27]. The noise bias of the average OCT signals can be removed by subtracting the average noise intensity (I noise , I noise MAG , and I noise CPX in Eqs. (11), (16) and (20), respectively) from the average signal intensity (I, I MAG , and I CPX in Eqs. (13), (18) and (22), respectively). The noise bias corrected average signal intensities then read Note that the three corrected average signal intensities now match the pure signal intensity I. Further, using the noise variances in Eqs. (12), (17) and (21), the respective noise bias corrected SNRs can be calculated as SNR 1 equates the quantity SNR − 1 = I/2σ 2 already known from the analyses in the previous sections. Even more strikingly, the noise bias corrected SNRs after magnitude and complex averaging now exhibit a simple proportionality of √ N and N with SNR 1 (Fig. 6), which also means that SNR will be zero if there is no signal for Eqs. (34)(35)(36). Using the noise bias correction, complex averaging thus always yields a √ N-times better SNR than magnitude averaging, even for weak input signals and small N.

Numerical simulation
OCT signals and noise were simulated according to the probability density functions described in Eqs. (3) and (6), respectively. For the noise contribution, binormally distributed complex signals with similar standard deviations σ along both the real axis and the imaginary axis were generated. For the signal contribution, the binormal distribution was generated around a real-valued signal A such that the phasor distribution was shifted from the origin of the complex plane to (A, 0) (see also Fig. 1(e)). A total of M = 100 averaged signals were computed for every simulation run. For complex averaging, in every run N = 1 to N = 100 complex-valued phasors were averaged. For magnitude averaging, the absolute values were computed for every single noise and signal phasor prior to averaging. Finally, the respective SNRs and SNR s were calculated from the average signal intensity and the variance of the noise signals for every N.
Results of the simulations performed in MATLAB (R2014a, MathWorks) are shown in Fig. 7. To showcase the effect of averaging for strong and weak signals, SNR simulations for two distinct relative signal levels of I/2σ 2 = 10 and I/2σ 2 = 0.1 are presented in Fig. 7(a) and (b), respectively. Unlike magnitude averaging, complex averaging markedly decreased the signal intensity but at the same time also reduced noise much more by averaging. For the standard Fig. 7. Simulation of the effect of averaging N signals with relative strength I/2σ 2 = 10 in (a) and I/2σ 2 = 0.1 in (b). Shown are the averaged signal-to-noise ratios calculated from N simulated phasors (•) alongside the corresponding theoretical plots (−). The SNRs without and with noise bias correction are plotted in the left and right panels, respectively. Without noise bias correction, the SNR CPX shows a better performance for the strong signal in (a), whereas SNR MAG dominates for N = 1 to 100 both for theoretical calculation and simulation. With noise bias correction (rightmost column), complex averaging similarly provides an N-fold improvement of the respective SNR = I/2σ 2 whereas an √ N-fold improvement can be observed for magnitude averaging. deviation of the noise fluctuations, a decrease by 1/ √ N and 1/N was observed for magnitude and complex averaging, respectively. The resulting SNRs are shown in the four panels in Fig. 7. Here, for averaging up to 100-fold, SNR CPX reveals a better SNR improvement for the strong input signal (Fig. 7(a)) when the noise bias is not corrected for, while magnitude averaging appears more effective for the weaker input signal in Fig. 7(b). After noise bias correction, however, no more dependence upon the input signal level can be observed, and complex averaging provides an N-fold increase in SNR whereas the SNR is only enhanced by a factor √ N after magnitude averaging.

Averaging experimental OCT data
For demonstrating the effect of the different averaging approaches on OCT images, we used a spectral domain (SD) OCT system in our lab [38]. This polarization sensitive SD-OCT system was used for imaging a stationary eye phantom. This system is based on a multiplexed superluminescent diode (λ = 840 nm, ∆λ = 100 nm) as a light source, a free-space Michelsontype interferometer, and a polarization-sensitive detection unit including two spectrometers. For the investigations presented here, only the co-polarized detection channel was analyzed; thus the PS-OCT system was reduced to what would be considered a standard SD-OCT with high-resolution imaging capabilities -3.6 µm axial resolution (assuming a refractive index of 1.35), 83 kHz A-scan rate -similar to state-of-the-art commercial SD-OCT scanners for retinal imaging.
The eye phantom, composed of several layers of transparent nail polish on a glass bead to produce a laminar reflectance pattern [38], was imaged using B-scan and M-scan protocols at different levels of attenuation (from 0 dB to -40 dB) to observe the effects of averaging in low-signal conditions. The B-scan protocol scanned a 1 mm lateral range with 100 repeats to generate cross-sectional images of the phantom (Fig. 8(a-d)). Representative depth profiles are shown in Fig. 8(e).
The M-scan protocol was used to acquire 1000 repeated A-lines from a fixed position within the phantom for a precise comparison of averaging methods. The M-scan protocol was chosen to minimize phase differences between consecutive A-lines and ensure that averaging represents a best-case real-world scenario. The 1000 A-lines were split into 10 temporally independent bins of 100 consecutive A-lines each. Complex and magnitude averaging for N up to 100 was performed on each of the 10 bins and the resulting signal curves were averaged together before calculating SNR and SNR in order to yield more stable SNR curves, particularly for lower numbers of averages. Here, the noise variances were computed from the noise pixel data of 100 consecutive A-lines, similar to the simulation above. Three characteristic SNR and SNR curves from the magnitude-averaging dominant, borderline, and complex-averaging dominant domains are shown in Fig. 8(f) and (g). Theoretical SNR profiles based on SNR 1 as described in Eqs. (25) and (26) demonstrate good agreement with the experimental data in Fig. 8(f). These results show that the theoretical magnitude-averaging dominant domain is reproducible with real world OCT data when the noise floor induced bias is not accounted for. Similarly, the SNR plots from the experimental data with noise bias correction follow the expected slopes of √ N and N ( Fig. 8(g)).

Discussion and conclusion
Signal averaging is one of the most commonly used procedures for OCT image processing. Here, we investigated the SNR performance of magnitude and complex averaging in theory and backed our findings with experimental results from simulations and actual OCT image data. We also studied the effect of the background noise bias on the measured SNR and calculated a noise bias corrected SNR. In the following paragraphs, the main observations are summarized and discussed in order to provide a better understanding of the strengths and weaknesses of the two approaches.

Impact of averaging on noise and signals
Complex averaging reduces the noise variance by 1/N, and thus provides a much stronger noise reduction than magnitude averaging, which only reduces the noise by 1/ √ N. At the same time, complex averaging also reduces the signal -in contrast to magnitude averaging, which maintains the signal level and only reduces noise. The main cause of the average signal decrease by complex averaging is the reduction of the average noise level. Both signal and noise characteristics have to be considered to understand the impact of averaging on OCT data.

Leaving or removing the noise floor bias for the SNR calculation
The signal-to-noise ratio is a commonly used measure to assess the image quality and to describe the system performance of OCT machines. Usually, the ratio of the average intensity of a signal peak and the standard deviation of the noise intensity is used to calculate the SNR [24,28]. While it is rather clear how to estimate the noise variance (e.g., by considering the noise intensity in a signal-free image area), the definition of the average intensity is not that obvious. As visualized in Fig. 2, taking the entire measured signal intensity from the zero line to the (average) peak intensity I includes two portions, namely the actual signal intensity term I and the background noise level I noise .
For relatively strong signals, the measured signal is dominated by the intensity term I and I ≈ I I noise . In contrast, when the actual signal contribution to the measured signal is small (i.e. I I), the noise floor bias I noise prevails. We hence investigated the signal-to-noise performance of magnitude and complex averaging for (a) leaving the noise floor bias as part of the measured signal and (b) correcting for the noise bias. As discussed in detail in the following sections, similar results were observed for both SNR analyses when strong signals were investigated. However, for rather weak signals, the impact of I noise manifested in the observation of dissimilar SNR characteristics.

Noise-afflicted signal-to-noise ratios after averaging strong and weak signals
For sufficiently large signals I = I + I noise , the N · SNR 1 term in Eq. (26) dominates such that complex averaging converges on an N-fold SNR improvement, while magnitude averaging only enhances the SNR √ N-fold. In contrast, for recovering weak signals comparable to or smaller than the noise, magnitude averaging appeared to outperform complex averaging -in particular for small N. When the input SNR 1 was just slightly larger than unity (i.e., I/2σ 2 just slightly greater than zero), the right-hand side of Eq. (26) was essentially neutralized such that the SNR improvement was far less than the √ N-fold enhancement yielded by magnitude averaging, especially in the limit of small N. Hence, magnitude averaging could be considered more effective for boosting small signals as they may be found in the outer nuclear layer in the retina when the noise floor induced signal bias is not accounted for. However, by taking a look at the SNR in absence of an actual signal (I = 0), odd SNR characteristics were observed (see section 2.3.3) which underscored the importance of a noise bias correction.

SNR calculations with noise bias correction
When SNRs are calculated in the classical way described in section 2.3.1, the average noise floor level I noise contributes to the intensity measured as "signal". This noise bias impacts the measured SNR, in particular for weak signals. Akin to noise offset removal in PS-OCT processing [39,40], we performed SNR calculations with noise bias correction in sections 2.3.4 and 3. By using this modified approach, a superior SNR performance was always observed for complex averaging, regardless of the input signal strength. Compared to the noise bias corrected SNR of a single signal prior to averaging, magnitude and complex averaging improved the SNR (after noise bias correction) by a factor of √ N and N, respectively. The theoretically predicted performance agreed well with the performance observed in simulated and experimental OCT data (Figs. 7 and  8). This finding suggests that noise bias correction definitely is an important processing step in SNR analyses -especially when it comes to averaging weak signals.

Implications for practical OCT image averaging
State-of-the-art swept source and spectral domain OCT devices deliver complex-valued signals (see Eq. (1)) right out of the box. Hence both magnitude and complex averaging can be easily implemented using the image data. For effective complex averaging, it is imperative that the phases of the signals are accurately aligned before averaging as described in more detail in Appendix B. While this means additional computational steps, it may be well worth the effort when the SNR is to be increased in images of scattering structures such as the retina where OCT has been frequently applied. Retinal OCT images may easily span a dynamic range of 40 dB with hyperscattering structures such as the nerve fiber layer and the pigment epithelium that can serve as landmarks for phasor alignment [18]. Other retinal tissues such as the ganglion cell layer and the outer nuclear layer [41], and also structures in the vitreous, which are usually much less reflective [42], may then be visualized by averaging multiple OCT images together. Complex averaging has also been found promising to detect signals in settings with multiple scattering [20].
Magnitude averaging on the other hand can be implemented at less computational expense and may be the averaging approach of choice for images containing mostly weak signals of interest, which would render phase alignment for complex averaging difficult or even impossible. Also in scenarios where images include a lot of varying motion, e.g. when visualizing weakly scattering structures such as flow in lymphatic vessels [43], albeit theoretically less effective by √ N in terms of SNR improvement, magnitude averaging may likely trump complex averaging.

Future perspectives
Finally, we would like to point out that, while the analysis presented here particularly focused on the SNR improvement for different averaging approaches, it may be interesting to also investigate other image metrics such as their contrast-to-noise characteristics and/or their efficiency in terms of speckle reduction. For this purpose and to explore scenarios with tissue-like scattering, recently described OCT signal models could be particularly interesting candidates [28,[44][45][46]. Additionally, OCT images are often displayed on a logarithmic scale in order to compress signals covering a large dynamic range. The logarithm pulls up weak signals but also the noise floor [21], such that the SNR performance for averaged logarithmic amplitude data will deviate from that for uncompressed OCT data discussed here. An in-depth analysis similar to the one performed for linear data in section 2.2 may provide more insight in the advantages and disadvantages of magnitude-and phasor-based averaging approaches for enhancing OCT image quality.

Appendix A -Glossary
The table below provides an overview of the variables and symbols used in this article. · MAG Quantity · after magnitude averaging (16)(17)(18)(19) · CPX Quantity · after complex averaging (20)(21)(22)(23) Appendix B -Complex signal averaging in case of axial motion Effect of axial motion on complex averaging Keeping the signal phase φ constant is essential for complex signal averaging. If d 2 φ(x, t)/dxdt 0, the complex sum of the signal vectors will have a length smaller than the sum of their absolute values. In other words, if the phases of the complex signals are not matched prior to averaging, the length of the average signal vector will be reduced. The condition ∆φ = 0 is fulfilled for simultaneously acquired signals at the same z-position within the same speckle. When averaging signals among different speckles acquired at the same time point, first the phase offset between different speckles has to be eliminated [48]. Similarly, when signals acquired at the same location x but at different time points t are averaged (e.g., signals from repeatedly acquired OCT images), the influence of axial motion has to be removed prior to averaging [18].
Axial motion introduces a proportional phase shift in the complex signals ( Fig. 9(a)). This phase shift has been extensively exploited for velocity measurements in Doppler OCT, for displacement measurements in OCT elastography and, recently most prominently, for some OCT angiography approaches [32][33][34][35][36]. The phase shift of an OCT signal is proportional to 2k∆z where k = 2π/λ is the central wavenumber and ∆z is the axial displacement between successive measurements. When N complex signals with displacements ∆z j relative to the first signal are averaged in a complex fashion, the resulting relative signal reduction corresponds to the penalty factor P(N): where W j = 2∆z j /λ denotes the displacement of the j-th signal in terms of wavelengths. In case of constant axial motion, the displacement will increase by (j − 1)∆z for the j-th signal compared to the first signal. The penalty factor P(N) when averaging N signals with relative separation W j = W is Examples for the relative signal intensity decrease due to P(N, W) are shown in Fig. 9(b). For large W, the upper limit of the rectified sinc-like decrease is given by 1/N.

Compensation of axial motion
In order to remove the detrimental effect of axial motion and thus to fully exploit the SNR advantage of complex averaging, the phases φ j of the signals to be averaged have to be aligned. Ju et al. proposed two approaches: one compensating phase offsets A-line-wise and another one compensating phase offsets in a pixel-wise fashion [18]. Assuming a constant "bulk" displacement of the simultaneously acquired signals within one A-scan, the bulk phase shift ∆φ bulk between the m-th A-line in a reference frame (ref ) and the corresponding m-th A-scan in other frames to be averaged can be calculated by computing the angle of the weighted complex signal average along every A-scan: where m, j, and * denote the A-scan index, the B-scan index and the conjugate complex, respectively, while z 0 and z max are the axial depth limits. By subtracting the bulk phase shift The assumption that displacements are constant along the depth profile does however not hold for all samples. In particular live biological samples are subject to dynamic deformations, which may for instance be caused by pulsating blood vessels. Hence, in another complex image averaging approach, all complex signals of one reference B-scan serve as a 2D baseline image S (ref ) OCT (x, z) [18]. The phasors in all pixel locations (x, z) of the subsequent frames are then aligned with respect to those in the reference frame by multiplying them to the conjugate complex of S (ref ) OCT (x, z) in an amplitude weighted manner, Since Eq. (41) does not only align phasors at signal locations but also all noise phasors, complex averaging of S (j,moco) OCT (x, z) essentially corresponds to magnitude averaging. This detriment can be overcome by locally averaging the phase difference values ∆φ j (x, z) = φ j (x, z) − φ ref (x, z) prior to phase balancing and complex averaging. For this purpose, the phase differences ∆φ j (x, z) are first calculated between the j-th frame and the reference frame. In order to minimize the number of phase wraps in presence of strong axial motion differences, we choose the central frame as a reference. Second, the j-th phase difference B-scan is averaged using the respective phasors within a small kernel which may, for instance, be rectangular: Ideally, the kernel size (∆x, ∆z) is chosen large enough to include at least a few speckles but small enough to only cover a patch of similar axial motion. The PDF of the averaged phase difference ∆φ j (x, z) centered at zero can be derived from Eq. (2) by performing a variable transform to polar coordinates and then integrating the PDF over the amplitude space [47], whereby σ ∆φ = √ 2σ is used for the standard deviation: where erf(u) = 1 √ 2π ∫ u −∞ exp −v 2 /2 dv denotes the Gaussian error function. With increasing A/σ ∆φ , the distribution p ∆φ becomes more narrow and converges towards a δ-function at ∆φ = 0 [22,47]. For ∆φ 0, the maximum of the distribution is located at the respective expectancy value for the phase difference. In absence of a signal A, the somewhat complicated PDF p ∆φ reduces drastically and becomes a uniform distribution on (−π, π), Hence, the averaging process in Eq. (42) provides an estimate of the localized phase shift ∆φ with improved precision in case signal pixels are included within the averaging kernel (∆x, ∆z) but delivers a random ∆φ if the kernel only includes noise. If now, in a third step, the locally averaged phase difference map (Eq. (42)) is used to correct the phasors of the j-th frame with respect to the reference frame, the phasors will only be aligned for signals but will retain their random character for noise regions.
By averaging N such axial motion balanced frames, the full potential of complex averaging can be tapped without suffering from signal reduction due to local displacement differences.