Multi-MHz retinal OCT

Abstract: We analyze the benefits and problems of in-vivo optical coherence tomography (OCT) imaging of the human retina at A-scan rates in excess of 1 MHz, using a 1050 nm Fourier-domain mode-locked (FDML) laser. Different scanning strategies enabled by MHz OCT line rates are investigated, and a simple multi-volume data processing approach is presented. In-vivo OCT of the human ocular fundus is performed at different axial scan rates of up to 6.7 MHz. High-quality non-mydriatic retinal imaging over an ultra-wide field is achieved by a combination of several key improvements compared to previous setups. For the FDML laser, long coherence lengths and a 72 nm wavelength tuning range are achieved using a chirped fiber Bragg grating in a laser cavity at 419.1 kHz fundamental tuning rate. Very large data sets can be acquired with sustained data transfer from the data acquisition card to host computer memory, enabling high-quality averaging of many frames and of multiple aligned data sets. Three imaging modes are investigated: alignment and averaging of 24 data sets at 1.68 MHz axial line rate, ultra-dense transverse sampling at 3.35 MHz line rate, and dual-beam imaging with two laser spots on the retina at an effective line rate of 6.7 MHz.


Importance of acquisition speed in retinal OCT
Optical coherence tomography is a noninvasive, three-dimensional imaging modality [1]. OCT has found widespread use in many applications, especially in ophthalmology for cross-sectional investigation of the human retina [2]. In recent years, OCT imaging speed has been increased drastically by the introduction of Fourier-domain OCT (FD-OCT) [3]. Nowadays, the fastest research systems operate at axial line rates of several MHz [4]. In contrast, commercial ophthalmic systems usually operate at tens of kHz axial line rate, about two orders of magnitude slower. Since sample motion restricts the acquisition time for in-vivo imaging, imaging speed effectively determines the total number of A-scans in each data set. Hence, either sampling density or field of view is restricted. For retinal imaging, snapshot-like acquisition times of well below 1 s are necessary for a high probability that the 3D data sets do not contain microsaccades [5]. Therefore, commercial systems typically have a variety of acquisition modes in order to accomplish different goals. In "hi-res" modes, sampling density is increased and/or multiple frames are averaged in order to produce high-quality single B-frames. Frame averaging is especially important to reduce speckle noise, which obstructs fine anatomical detail [6,7]. In contrast, sampling density is typically reduced for acquisition of 3D data sets, which provide important diagnostic data, for instance thickness maps. It is usually not possible to have both the highest quality B-frames and high-resolution 3D capability in a single data set. This is why methods based on compressive sensing are currently being investigated [8]. Due to the small field of view, 3D data sets are usually restricted to either the macula or the optic nerve head (ONH) area, and the operator needs to decide on the scanned region prior to data acquisition.
Stitching or "mosaicing" of multiple data sets can be applied to increase the effective field of view, but these techniques increase acquisition time considerably [9-11]. Moreover, all volumes need to be acquired before they can be fused, which results in a significant delay between image capture and display of the large field of view. For these reasons, many systems employ an eye tracker, which corrects for involuntary eye motion and thus allows for an extended measurement time [12]. Eye trackers make the system more complex, since they need to incorporate an additional imaging mechanism with a frame rate higher than the OCT volume acquisition rate. Typically a scanning laser ophthalmoscope (SLO) with a separate beam steering mechanism is used, which makes it difficult to ensure pixel-to-pixel correlation between OCT and SLO data over the entire imaging volume. This issue may be overcome if most of the optical path is used for both the SLO and the OCT channel [13]. Moreover, since tracking accuracy is limited, it is usually not possible to generate high-resolution en-face images directly from the tracked OCT data set [14], although some research systems have shown higher tracking accuracy [15]. Hence usually only the SLO image is used for a transverse view of the retina, not the reconstructed OCT en-face image. However, SLO images represent only the total integrated intensity from all retinal layers, whereas OCT can generate depth-resolved en-face views with very high depth resolution [16]. These depth-resolved en-face views are a unique feature of OCT [17], and facilitate the visualization of many pathologies [18-20]. For instance, depth-resolved OCT allows the visualization and three-dimensional characterization of the choroid [21,22]. Due to the limited 3D data set size, these depth-resolved en-face views usually cover a smaller area than a typical SLO scan.
The problem of limited data set size can be overcome with MHz-OCT, i.e. with OCT systems capable of axial line rates in excess of 1 MHz. With MHz-OCT, a single data set that spans a very large field of view can be acquired in a short time. These large data sets cover both the macula and ONH area, and enable ultrawide-field depth-resolved fundus views over an area comparable to or even larger than that covered by typical SLO scans [23]. Moreover, if the OCT imaging speed is increased into the MHz and multi-MHz range, efficient speckle reduction by frame averaging can be used without loss of wide-field 3D and functional imaging capabilities, as will be shown in this paper. Other particularly promising applications of high-speed 3D OCT are the tomographic analysis of the posterior pole [24], volumetric adaptive optics OCT [9] and functional imaging [25-29]. For instance, the system configuration described in this paper has also been used to generate in-vivo single-shot wide-field microangiography images of the retina for the first time [30].

Pathways to higher speed in OCT
There are two approaches to achieve effective axial scan rates of more than one MHz: increased single-beam scan rate, or parallelization of illumination and/or data acquisition. Increasing the single-beam scan rate may seem straightforward, and relies on faster swept lasers or higher spectrometer readout rates. However, for a variety of technical reasons, only a few systems have demonstrated speeds in excess of 200 kHz. The fastest single-camera spectral domain OCT (SD-OCT) system has been demonstrated by Potsaid et al., running at up to 312.5 kHz line rate with reduced image quality [31]. More complex system architectures with parallelized data acquisition allow for a further increase in SD-OCT speed: with two spectrometers, axial line rates of 500 kHz have been demonstrated, with a ~3 dB sensitivity penalty compared to a single spectrometer [32]. The standard spectrometer design with a single camera can also be replaced with an optical demultiplexer and hundreds of single photodiodes. This constitutes an interesting approach, but currently suffers from very high cost and system complexity [33,34].
In swept source OCT (SS-OCT), also known as optical frequency-domain imaging (OFDI) [35], axial line rate is given by the sweep speed of the laser light source. In short cavity lasers, sweep rate is physically limited by the buildup time for laser activity [36]. Due to this limit, it seems unlikely that speed can be increased beyond about 200 kHz to 400 kHz with short cavity lasers [37,38]. Another concept, which might achieve higher OCT imaging rates in the future, is based on Vernier-tuned distributed Bragg reflector (VT-DBR) technology. So far, a VT-DBR laser has only been used for OCT imaging at 1550 nm with limited axial resolution [39]. Currently, the physical limit to sweep speed can be overcome in four ways:
(1) Vertical-cavity surface-emitting lasers (VCSELs) can operate in a single longitudinal mode due to their short cavity length, which is on the order of a few wavelengths. Hence, very long instantaneous coherence lengths can be achieved [44]. However, it should be pointed out that this ultra-long coherence length is not necessary for retinal imaging, since typically only ~2 mm depth range is necessary. Retinal imaging with a VCSEL has been performed at up to 580 kHz axial line rate [45].
(2) In Fourier domain mode-locked (FDML) lasers [41], the cavity round trip time is synchronized to the intracavity wavelength filter frequency. Therefore, the fundamental limit to laser tuning rate can be overcome, and single beam scan rates of up to 5.2 MHz were demonstrated at 1300 nm wavelength [4]. With chromatic dispersion management [46, 47], the coherence length of FDML lasers was enhanced considerably to more than 20 mm [48]. FDML lasers can also have high output power directly from the cavity, on the order of tens of milliwatts [4,23,49].
(3) Since a sweep corresponds to an extremely chirped pulse, very high sweep speeds can also be generated by temporally dispersing ultrashort pulses [42]. To date, record sweep speeds of 90.9 MHz were demonstrated, albeit only at a sensitivity of 42 dB, unsuitable for imaging of biological tissue [50].
(4) In wavelength swept ASE sources, light is repeatedly filtered and amplified by a series of filter and gain elements, which form no laser cavity [43]. Hence, the instantaneous linewidth is directly given by the number of consecutive filter events. Using two filter events, retinal imaging has been performed with a fast sweep rate of 340 kHz [51].
Due to the intrinsic limitations of all spectrometer solutions and other swept laser sources that have been used for OCT imaging, currently swept-source OCT with either VCSEL or FDML laser sources seems to be the most promising way for (multi-) MHz speeds in retinal OCT.

Challenges of MHz speeds
As discussed in the first section, increased speed offers a multitude of advantages over slower systems for retinal imaging. However, line rates in the MHz and multi-MHz range also raise two challenges. First, the high speed imposes strict requirements on many parts of the OCT system, not only on the light source. Second, increased speed goes hand in hand with decreased sensitivity, which may limit the reasonable speed to only a few MHz maximum. We will briefly discuss both of these challenges for SS-OCT, starting with sensitivity.
For in-vivo retinal imaging of humans, power incident on the patient is limited by international standards, such that the sample arm power cannot be arbitrarily increased. Since shot-noise-limited sensitivity scales linearly with both the optical power incident on the sample and the A-scan duration, sensitivity decreases with increased speed at fixed power. For retinal imaging, the incident power level is limited by maximum permissible exposure (MPE) values of standards such as ANSI Z136.1, ISO 15004-2 or IEC 60825-1. For instance, ISO 15004-2 limits the power of continuous non-scanned illumination incident on the cornea in typical OCT configurations to 1.4 mW at 1050 nm, while the MPE is slightly higher for the other standards.
With 1.4 mW on the cornea, even shot-noise-limited sensitivity falls below a reasonable mark of ~90 dB when coupling and interferometer losses are taken into account (Fig. 1(a)). The calculation is based on the following formula for shot-noise-limited sensitivity S [52]:
$$S = 10 \log_{10}\left(\frac{\rho P T}{e}\right) - IL$$
where $\rho$ is a typical detector responsivity at 1060 nm, P is the optical power incident on the sample, T is the duration of an A-scan (inverse A-scan rate), e is the elementary charge of the electron, and IL is the loss (in dB) on the way back from the sample to the detector. The formula assumes that apart from IL, all power returning from the sample is incident on the photodetector. For clinical imaging, the requirements on sensitivity may even be higher than in a laboratory setting, since the collected signal intensity may be severely lowered by media opacities and optical aberrations of the patient eye. Assuming a minimum required sensitivity of 95 dB and IL = 3 dB in the setup, the maximum axial line rate is below 1 MHz for 1.4 mW sample arm power. Hence, for fast systems it is very important to ensure a high collection efficiency, and a suitable interferometer design for sensitivity close to the shot noise limit [53]. Additionally, there are various pathways to acceptable signal strength: First, higher optical power levels may be applicable when beam scanning and shorter exposure times are taken into account. However, this requires additional system safety mechanisms to ensure that power is always below MPE, even if the beam scanning system fails or exposure time is extended. Figure 1(b) shows sensitivity vs. speed for an optical power of 3.5 mW, which has already been applied for in-vivo human imaging at 1050 nm [54]. At this power level, multi-MHz retinal imaging with high sensitivity is feasible even in the presence of interferometer losses.
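The sensitivity estimate above can be reproduced with a few lines of Python. This is a minimal sketch of the standard shot-noise expression S = 10 log10(ρPT/e) − IL, not the authors' own calculation; the responsivity of 0.7 A/W is an assumed typical value (the number used in the paper is not recoverable from the text), and the function names are hypothetical.

```python
import math

ELEMENTARY_CHARGE_C = 1.602176634e-19  # elementary charge e in coulombs


def shot_noise_sensitivity_db(power_w, line_rate_hz,
                              responsivity_a_per_w=0.7, loss_db=0.0):
    """Shot-noise-limited sensitivity S = 10*log10(rho*P*T/e) - IL in dB.

    responsivity_a_per_w = 0.7 is an ASSUMED typical value near 1060 nm.
    """
    a_scan_duration_s = 1.0 / line_rate_hz  # T is the inverse A-scan rate
    electrons = (responsivity_a_per_w * power_w * a_scan_duration_s
                 / ELEMENTARY_CHARGE_C)
    return 10.0 * math.log10(electrons) - loss_db


def max_line_rate_hz(power_w, required_sensitivity_db,
                     responsivity_a_per_w=0.7, loss_db=0.0):
    """Highest A-scan rate that still reaches the required sensitivity."""
    electrons_needed = 10.0 ** ((required_sensitivity_db + loss_db) / 10.0)
    return (responsivity_a_per_w * power_w
            / (ELEMENTARY_CHARGE_C * electrons_needed))


if __name__ == "__main__":
    rate = max_line_rate_hz(1.4e-3, 95.0, loss_db=3.0)
    print(f"max line rate for 95 dB at 1.4 mW, IL = 3 dB: {rate / 1e6:.2f} MHz")
    for line_rate in (1.68e6, 3.35e6):
        s = shot_noise_sensitivity_db(1.4e-3, line_rate, loss_db=3.0)
        print(f"{line_rate / 1e6:.2f} MHz: {s:.1f} dB")
```

With the assumed responsivity, the 95 dB requirement at 1.4 mW and 3 dB loss yields a maximum line rate just under 1 MHz, in line with the statement in the text.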
Second, effective sensitivity is increased by averaging. As discussed above, averaging is already commonly employed to reduce speckle noise. In fact, the perceived OCT image quality depends so critically on speckle reduction that, in the long term, it is likely that most retinal OCT systems will employ some type of averaging. Hence, the signal level is automatically increased, which readily alleviates the detrimental effects of higher speeds. Since SNR increases with the square root of the number of averaged frames N for incoherent averaging [55], even a moderate amount of 10x averaging will theoretically increase SNR by 5 dB. We will show later in this paper that high image quality can be achieved at multi-MHz speeds and low sample arm power, using different averaging schemes.
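The square-root scaling quoted above translates into decibels as 10 log10(√N) = 5 log10(N). The short sketch below (function name hypothetical) evaluates this theoretical gain for the frame counts used later in the paper; real gains may be lower if frames are correlated.

```python
import math


def incoherent_averaging_gain_db(n_frames):
    """Theoretical SNR gain in dB when SNR grows with sqrt(N)."""
    return 10.0 * math.log10(math.sqrt(n_frames))


if __name__ == "__main__":
    for n in (10, 24, 48, 96):
        print(f"N = {n:3d}: +{incoherent_averaging_gain_db(n):.1f} dB")
```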
Third, optical power may be even further increased with multi-beam approaches, in which each beam illuminates a different sample region [56]. Hence, the effective line rate increases with the number of spatially separated beams, whereas the sensitivity for each beam remains constant. This multi-beam approach is especially promising for swept-source OCT, since the detection scheme can be more easily extended to a multi-channel setup than in spectral domain OCT. Multi-beam swept-source imaging has been demonstrated at 1300 nm with 20 MHz line rate [4]. For retinal imaging, two beams have been used to achieve 400 kHz line rate in an SS-OCT setup at 1050 nm [37], and three beams at 840 nm for 210 kHz line rate [56].
Fourth, the amount of collected optical power also depends on the collection efficiency of the system. Due to optical aberrations, the collection efficiency reaches a maximum for rather small numerical apertures in retinal imaging. Adaptive optics OCT and joint-aperture OCT (JA-OCT) have been demonstrated to increase the collection efficiency of retinal OCT. JA-OCT additionally provides intrinsic speckle reduction while maintaining MHz line rates [57].
Besides these limitations concerning OCT system sensitivity, the system requirements for (multi-) MHz OCT currently push the limits of the detection and the beam deflection systems. Fortunately, available electronic bandwidth, analog-to-digital (AD) sampling rate, and computational power have increased considerably in recent years, such that sustained acquisition of large data sets as demonstrated in this paper is possible. However, the mechanical specifications of the beam deflection system have not evolved accordingly. Standard galvanometer scanners restrict the frame rate to less than ~1 kHz, which is not enough to take full advantage of multi-MHz OCT systems. While resonant scanners provide high speed, they are not adjustable in terms of frame repetition rate, which is inconvenient for many applications. Both the sensitivity and the beam steering issues may be overcome with parallel illumination. With line and full-field illumination, optical power on the sample can be increased considerably [5]. Very high speeds can be achieved, but crosstalk between pixels may lead to a severe degradation of image quality compared to standard confocal OCT. Multi-beam OCT, which maintains confocal beam delivery, also reduces requirements on the beam deflection system.
Here, we present a combination of multi-MHz single-spot line rates together with dual-beam OCT imaging, to yield line rates in excess of 6 MHz. Compared to previous results, better axial resolution, enhanced coherence length and novel data processing schemes will be shown.

Laser setup
As a light source, we used a 1050 nm FDML laser [49,58,59], in a high-power ytterbium co-amplified configuration [23]. The laser and buffer stage layout is shown in Fig. 2. Two key improvements were implemented compared to the laser presented in [23]: First, we used a bulk Fabry-Perot tunable filter (BFP-TF) instead of a fiber Fabry-Perot tunable filter (FFP-TF) [4]. The laser thus operates at a fundamental repetition rate of 419 kHz instead of 171 kHz, due to a higher filter resonance frequency. In contrast to most other swept lasers, this substantial increase in sweep speed leads to even more stable laser operation. We believe that this is mainly due to the shorter intracavity delay fiber, which has reduced absorption and polarization rotation [60]. Second, intracavity chromatic dispersion was compensated by a chirped fiber Bragg grating (cFBG, Teraxion Inc.), which has been shown to greatly improve the coherence length of FDML lasers [48,61]. After the laser cavity, the fundamental laser repetition rate is increased by time-interleaving, a technique also known as buffering [62]. To this end, the laser current was modulated to a 25% or 12.5% duty cycle for four- and eight-times buffering, respectively, using a fast laser diode driver (Wieserlabs WL-LDC10D). Hence, the fundamental 419 kHz repetition rate is increased to 1.68 MHz and 3.35 MHz. The laser performance also benefited from a high-power semiconductor optical amplifier (SOA, Innolume GmbH) with large ~25 dB small signal gain, which was used both in the FDML laser cavity and as booster after the buffer stage. Since the SOA already has high gain, the ytterbium-doped fiber was only very weakly pumped with ~40 mW optical power at 978 nm, which provided additional polarization-independent gain, mainly at the lower wavelength end of the sweep around 1030 nm.
Due to the combination of shorter fiber length, dispersion compensation and high gain, the fiber could be wound on a fiber spool of ~22 cm diameter, much smaller than the 28 inch bicycle rim used as a spool for the delay line in the previous setup [23]. Note that again standard single-mode fiber (SSMF, OFS Allwave ZWP) was used in the delay line, which reduces cost by a factor of ~50-100 compared to specialized fiber that offers true single-mode operation (for instance Corning HI 1060). SSMF supports more than one transverse mode around 1060 nm, but with proper splicing to single-mode fiber (HI 1060), the fundamental mode is excited almost exclusively. This is indirectly confirmed by insertion loss measurements on SSMF spools with various fiber lengths, which yielded an attenuation of 0.8 dB/km and splice insertion loss of 0.4 dB (for one transition of HI 1060 to SSMF, i.e. 0.8 dB splice loss for the entire spool).
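The relation between the fundamental sweep rate, the buffering factor, the laser duty cycle and the dual-beam line rate described in this section is simple arithmetic. The sketch below uses the 419.1 kHz fundamental rate quoted in the abstract; the function names are hypothetical.

```python
FUNDAMENTAL_RATE_HZ = 419.1e3  # fundamental FDML tuning rate from the abstract


def buffered_sweep_rate_hz(fundamental_hz, buffer_factor):
    """Sweep rate after time-interleaving (buffering) the sweeps."""
    return fundamental_hz * buffer_factor


def laser_duty_cycle(buffer_factor):
    """Fraction of the filter period during which the laser is switched on."""
    return 1.0 / buffer_factor


def effective_line_rate_hz(sweep_rate_hz, n_beams):
    """Effective A-scan rate with several spatially separated beams."""
    return sweep_rate_hz * n_beams


if __name__ == "__main__":
    for factor in (4, 8):
        rate = buffered_sweep_rate_hz(FUNDAMENTAL_RATE_HZ, factor)
        print(f"{factor}x buffering: {rate / 1e6:.2f} MHz, "
              f"duty cycle {100 * laser_duty_cycle(factor):.1f}%")
    dual = effective_line_rate_hz(
        buffered_sweep_rate_hz(FUNDAMENTAL_RATE_HZ, 8), 2)
    print(f"dual-beam, 8x buffering: {dual / 1e6:.1f} MHz")
```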

Laser characterization
The laser spectrum and coherence properties were characterized directly after the cavity output, i.e. before the buffer stage (see Fig. 2), using an optical spectrum analyzer (OSA, Yokogawa AQ6370B), a motorized Mach-Zehnder interferometer (MZI, home built), a 1 GHz dual-balanced photodetector (BPD, Wieserlabs WL-BPD1GA), and a fast digital oscilloscope (Tektronix DPO 5104). Compared to our previous results using a 3 MHz FDML laser without dispersion compensation [63], we improved the sweep range by 53% (72 nm vs. 47 nm) and the coherence length by 167%, i.e. by a factor of about 2.7 (1.6 mm vs. 0.6 mm). Note that for the dispersion-compensated laser the 6 dB drop occurs at frequencies that are already close to the bandwidth limit of the detection system; hence a detection system with more electronic bandwidth may yield even larger coherence lengths. For long imaging depths, sensitivity decay is therefore dominated by the detection system's 1 GHz electronic bandwidth.

OCT imaging setup, data acquisition and processing
The OCT imaging setup was designed to enable sustained multi-MHz single-beam and dual-beam retinal imaging over a wide field of view. These requirements can be divided into two groups: First, the optical system in the sample arm has to be designed for wide-field retinal imaging. Since alignment is critical in wide-field retinal imaging, a fixation target and visual alignment control should be available. Second, the detection system needs to offer enough bandwidth (analog electronic bandwidth and digital sampling rate), while at the same time allowing detection close to shot noise. As will be shown later, it is also beneficial to provide a means for sustained acquisition of very large data sets, on the order of tens of gigabytes. The basic interferometer layout and optical design have been adapted from [23] (Fig. 4), with improvements already used in [30]. The fiber output was collimated with an aspheric lens of 11 mm focal length (Thorlabs C220TME-C). A combination of 8 spherical two-inch lenses, arranged in two groups (L1 and L2), was used to relay the scanner pivot point into the human pupil, where the beam had a diameter of approximately 1.1 mm (1/e² intensity). A three-inch dichroic mirror (Layertec GmbH, reflectivity > 99.9% from 1000 nm to 1090 nm) placed between the two relay lens groups coupled a fixation target and a camera for pupil alignment into the sample arm. For wide-field retinal imaging, especially in the non-mydriatic eye, it is important that the incoming OCT beam passes through the center of the pupil to reduce tilt of the retina in the OCT images. Moreover, we found that, for our setup, the best image quality is achieved when the pivot point of the imaging beam is located approximately at the nodal point of the eye, close to the posterior surface of the crystalline lens. Axial alignment to the nodal point also reduces retinal curvature in the OCT images [64].
It should be emphasized that this alignment strategy is different to the common approach, where the axial position of the pivot point is placed close to the pupil, in order to avoid clipping of the extreme rays [64,65].
The single-beam interferometer was duplexed for two-beam imaging, similar to previous multi-beam approaches. Two essentially identical interferometers are used for the two channels. A completely symmetric fiber interferometer layout configuration assures that detection is spectrally balanced, without the need to digitize the two photodiode outputs separately [66]. A recalibration curve for time to k-space resampling was acquired prior to the acquisition of each data set, using the built-in recalibration arm in the interferometer. Sample and recalibration arms were shuttered automatically by a servo motor when the recalibration curve was recorded. Contrary to short cavity lasers, no k-clocking or simultaneous acquisition of reference fringes is necessary [67], since the FDML laser has low sweep-to-sweep variation of the time-wavelength evolution. The same recalibration curve was used for the entire data set. Note that recalibration was carried out with the recalibration arm of only one of the two channels, but for each buffered sweep separately. The same recalibration data was used for the second channel.
The OCT signal was detected with a 1 GHz dual-balanced photodiode (Wieserlabs WL-BPD1GA), low-pass filtered (BLP-600+, Mini-Circuits) and digitized by an 8 bit PCIe data acquisition card at 1.5 GS/s (A/D card, Signatec PX1500-4). At this sampling rate, the A/D card enabled sustained data transfer to host computer (PC) memory with 85% duty cycle. Hence, our data set size is no longer limited by the onboard memory of the digitizer, which is usually much more expensive than PC RAM. In our case, more than 44 GB of data can be acquired using 48 GB of PC memory. After data acquisition, the OCT data set was stored on the computer hard drive and processed using standard OCT signal processing, including numerical dispersion compensation. Due to the ultrahigh frame rate, the fast galvanometer scanner was driven with a sinusoidal signal. Both scan directions were used with an 85% duty cycle, and images were corrected for nonlinear scanning via linear interpolation in postprocessing [4,63]. In our processing chain it appeared advantageous to carry out interpolation on the logarithmized OCT intensity images to minimize interpolation artifacts caused by noise. However, the physically correct way to average sample reflectivity is to average OCT signal intensity, i.e. the squared OCT signal amplitude. Hence, we performed a multi-step procedure for averaging multiple B-frames: First, linear intensity (squared amplitude) values from each scan direction were averaged before logarithmization and interpolation. Subsequently, the interpolated logarithmic images from both scan directions were averaged.
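The multi-step averaging procedure described above can be sketched in plain Python as follows. This is an illustrative sketch under stated assumptions, not the authors' processing code: the scanner is assumed to follow an ideal sine with the central 85% of each half period in use, backward-sweep frames are assumed to need mirroring, B-frames are represented as lists of depth rows, and all function names are hypothetical.

```python
import math


def scan_positions(n_ascans, duty_cycle=0.85):
    """Normalized beam position of A-scans taken at uniform time steps."""
    half_span = duty_cycle * math.pi / 2.0
    return [math.sin(-half_span + 2.0 * half_span * i / (n_ascans - 1))
            for i in range(n_ascans)]


def linearize_row(row, duty_cycle=0.85):
    """Resample one depth row onto equidistant positions (linear interp.)."""
    n = len(row)
    half_span = duty_cycle * math.pi / 2.0
    x_max = math.sin(half_span)
    out = []
    for j in range(n):
        x = -x_max + 2.0 * x_max * j / (n - 1)      # equidistant target
        theta = math.asin(max(-1.0, min(1.0, x)))   # scanner phase there
        t = (theta + half_span) / (2.0 * half_span) * (n - 1)
        i0 = max(0, min(int(math.floor(t)), n - 2))  # left neighbour index
        frac = t - i0
        out.append((1.0 - frac) * row[i0] + frac * row[i0 + 1])
    return out


def average_images(images):
    """Pixel-wise mean of equally sized images (lists of rows)."""
    n = len(images)
    rows, cols = len(images[0]), len(images[0][0])
    return [[sum(img[r][c] for img in images) / n for c in range(cols)]
            for r in range(rows)]


def to_db(image, floor=1e-12):
    """Logarithmize linear intensity; 'floor' (assumed) avoids log(0)."""
    return [[10.0 * math.log10(max(v, floor)) for v in row] for row in image]


def average_bidirectional(forward_frames, backward_frames, duty_cycle=0.85):
    """Step 1: average linear intensity per scan direction, then take the
    logarithm and interpolate. Step 2: average the two log images."""
    log_images = []
    for frames, mirrored in ((forward_frames, False), (backward_frames, True)):
        db = to_db(average_images(frames))
        if mirrored:  # ASSUMPTION: backward sweep is stored right-to-left
            db = [row[::-1] for row in db]
        log_images.append([linearize_row(row, duty_cycle) for row in db])
    return average_images(log_images)
```

Averaging the linear intensities first keeps the reflectivity estimate physically meaningful, while interpolating after logarithmization follows the order that the text reports as less prone to noise-induced interpolation artifacts.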
Cut-levels were used to restrict the dynamic range to 25 dB in the logarithmic images, with the lower cut-level automatically set 2 dB lower than the mean noise floor for each data set.
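The cut-level rule stated above (25 dB dynamic range, lower cut-level 2 dB below the mean noise floor) can be written as a small display-scaling helper. This is a sketch only: how the mean noise floor was estimated is not stated in the text, so the estimate from user-chosen signal-free rows is an assumption, and the function names are hypothetical.

```python
def mean_noise_floor_db(db_image, noise_rows):
    """Mean dB value over rows ASSUMED to contain no sample signal."""
    values = [v for r in noise_rows for v in db_image[r]]
    return sum(values) / len(values)


def apply_cut_levels(db_image, noise_floor_db,
                     dynamic_range_db=25.0, offset_db=2.0):
    """Clip a logarithmic image to the displayed range and scale to 0..1."""
    lower = noise_floor_db - offset_db   # lower cut-level
    upper = lower + dynamic_range_db     # upper cut-level
    return [[(min(max(v, lower), upper) - lower) / dynamic_range_db
             for v in row] for row in db_image]


if __name__ == "__main__":
    image_db = [[10.0, 12.0], [20.0, 50.0]]
    floor_db = mean_noise_floor_db(image_db, noise_rows=[0])
    print(apply_cut_levels(image_db, floor_db))
```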

Overview
In-vivo non-mydriatic retinal imaging was performed on a healthy volunteer in accordance with the tenets of the Declaration of Helsinki. To ensure comparability of the different imaging speeds and protocols, the same eye of the same subject was imaged for all data sets presented in this paper. At 1.68 MHz laser sweep rate, ultra wide-field fundus coverage with dense isotropic sampling is possible with acquisition times of less than a second. Here, we acquired more than a million axial scans in 0.83 s, including galvanometer dead time. This short acquisition time already yields undistorted data sets with high probability. Moreover, even the slow galvanometer axis moves with an angular velocity of about 60 °/s, on the order of the typical saccade speed [68]. This means that distortions by saccades and other involuntary patient movements are spread out over large areas. Hence more B-frames are affected, and the relative displacement between successive B-frames is reduced, leading to a smoother transition between undistorted and distorted areas. Figure 5 shows an en-face view of a data set consisting of 1088 × 1088 A-scans, and a representative, unaveraged B-frame. The total imaging depth of 3.4 mm in air is sufficient for wide-field imaging, covering the macula, the optic nerve head (ONH) and parts of the periphery. Apart from one horizontal line in the middle of the B-frame, no artifacts are visible in the image. Sensitivity was measured to be 91 dB with a neutral density filter and a test mirror, with 1.4 mW sample arm power. Taking into account 3 dB loss in the interferometer and 65% coupling efficiency, this measured sensitivity value is shot-noise limited. We chose a power of 1.4 mW to allow for uninterrupted sequential acquisition of many data sets. As already discussed, this acquisition of tens of gigabytes of data was enabled by sustained data transfer from the A/D card to host computer memory.
With a total of 24 acquired data sets, new averaging protocols can be applied to reduce the impact of involuntary eye motion and to reduce speckle noise.
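The acquisition time and raw data volume quoted in this section can be checked with a short calculation. The sketch below assumes that the 85% duty cycle accounts for the galvanometer dead time and that only the digitized samples of the used A-scans are stored at 1 byte per sample (8 bit); both are assumptions, and the function names are hypothetical.

```python
def acquisition_time_s(n_ascans, line_rate_hz, duty_cycle=0.85):
    """Time to acquire n_ascans when only a fraction of sweeps is used."""
    return n_ascans / (line_rate_hz * duty_cycle)


def raw_data_bytes(n_ascans, line_rate_hz,
                   sample_rate_hz=1.5e9, bytes_per_sample=1):
    """Raw data volume, ASSUMING only the used A-scans are stored."""
    samples_per_sweep = sample_rate_hz / line_rate_hz
    return n_ascans * samples_per_sweep * bytes_per_sample


if __name__ == "__main__":
    n = 1088 * 1088
    rate = 4 * 419.1e3  # four-times buffered sweep rate
    print(f"A-scans per data set: {n}")
    print(f"acquisition time: {acquisition_time_s(n, rate):.2f} s")
    total = 24 * raw_data_bytes(n, rate)
    print(f"24 data sets: {total / 1e9:.1f} GB")
```

Under these assumptions, 24 data sets of 1088 × 1088 A-scans amount to a few tens of gigabytes, consistent with the order of magnitude given in the text.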

The simple "acquire-align-average" (AAA) processing approach
There have been various publications about averaging of several OCT volumes [15,67,69-71]. Most of these methods are either complex, or slow, or require a tracking system during data acquisition. Here we demonstrate a simple and very fast approach, which still shows remarkably good results. Our approach has some similarities to automatic volume-stitching (mosaicing) methods. However, instead of acquiring many volumes from different locations of the retina, the same volume is repeatedly scanned to form a series of data sets. Afterwards, these data sets are automatically aligned to each other and averaged, an approach that we call "acquire-align-average (AAA)", which produces single highly averaged images (Fig. 6).
Since image distortions are randomly spread over all data sets, distortions will be statistically removed from the average of all data sets. In AAA, transverse alignment based on the OCT en-face projections is performed first. Second, B-frames from the same location in each of these transversally aligned data sets are aligned to each other and averaged. This scheme yields highly averaged images for both en-face views and B-frames. It is conceptually similar to extracting motion information from an SLO channel in post-processing: since the OCT volume rate is fast enough, the OCT en-face image can be used instead of the SLO image. The AAA approach obviously does not produce a single corrected 3D data set, since axial shift will still be observed in the direction of the slow scan axis. However, in almost all cases (depth-resolved) en-face views and B-frames are the only views used anyway. Moreover, the additional 3D information is mainly used to generate thickness maps, which should not be affected by the improper 3D alignment along the slow axis. Thus, all information that is used in current clinical retinal OCT systems will benefit from the averaging-based signal improvement in AAA. In future, full 3D correction may be carried out, for instance by using a reference B-scan along the slow axis [64]. Other methods, which have been used for mosaicing, may be employed as well [10,72-74], but without the need for special scan patterns [69,70].
Fig. 6. a) After acquisition of N data sets, the data sets are aligned and averaged. Thus, distortion per frame is "averaged out" with increasing N. b) Result of the AAA approach to imaging at 1.68 MHz. 4 frames from each of the 24 data sets were averaged (i.e. a total of 96 frames) to yield strong speckle reduction. c) Enlarged view of the region indicated by the white frame in b). All retinal layers including the ELM are clearly visible, indicating that alignment worked well. Note that the image displayed in the right column of a) is the averaged en-face image after AAA processing of the 24 en-face images. Media 2 shows the en-face reconstructions of all acquired data sets before registration at a playback speed of 2 volumes per second. Slight "zipper" artifacts can be observed due to instabilities in the galvanometer scanners operating at very high speed. A blinking artifact is clearly visible in the first data set.
After acquisition of all data sets, transverse alignment was performed using the reconstructed en-face images. Here, registration was performed with the StackReg plug-in in Fiji [75,76], using the rigid body model algorithm that allows translation and rotation only. After registration, the transverse displacement information from the en-face image should be applied to the entire 3D data sets. In our case, the data sets showed only little translation and rotation with respect to each other. Therefore, it was sufficient to simply take B-frames at the same location in each uncorrected data set, and to align them to each other independently of the transverse information from the en-face alignment process. This constitutes an even simpler approach than the full AAA procedure described above. Figures 6(b), 6(c) show the resulting averaged B-frame, at the same location as indicated in Fig. 5. In fact, the data presented in Fig. 5 was taken from one of the 24 data sets. A total of 96 B-frames were averaged: 4 adjacent B-frames were averaged in each of the 24 data sets. Despite the fact that only basic alignment as described above was applied, the averaged frames still show details from all retinal layers, including the external limiting membrane (ELM) and blood vessels in both the retina and the choroid. Additionally, the resulting strong speckle reduction enables clear differentiation of layers up to the choroidal-scleral interface. Also note that the AAA approach not only reduces the impact of involuntary eye movements: Since the data sets are in general slightly shifted with respect to each other, the intensity of image artifacts like the fixed-pattern noise line visible in Fig. 5(b) is also reduced. Moreover, the eye-blink artifact observed in one of the data sets (see Media 2) is also suppressed in the averaged en-face image.
Surprisingly, MHz OCT enables fast averaging of frames from multiple volumes even without the information from transverse alignment. The AAA approach exhibits high resolution and low speckle noise without the need for a hardware eye tracker and without the need for very accurate alignment.

3.35 MHz imaging: very densely sampled data sets
At 3.35 MHz, the measured sensitivity dropped to 88-89 dB, again at the shot noise limit with a sample arm power of 1.7 mW. The reduced signal compared to the images in the previous section is visible (Fig. 7). A smaller field of view of 40-45° was chosen for two reasons: First, the galvanometer scanners no longer worked reliably at larger scan angles at these speeds. Second, the imaging range of our system was reduced to 1.7 mm due to the finite sampling rate and electronic bandwidth. Due to the natural curvature of the fundus, the retina moved out of the imaging range at the periphery, such that the field of view was limited by the A/D sampling rate (see Media 3).
Fig. 7. [...] As can be seen in c), the field of view is not limited by the sample arm optics or the laser coherence length, but by the sampling rate of the A/D card (and the resulting small imaging range). Media 3 shows only every fifth frame of the data set to reduce the size of the movie. For the movie, the frame rate was set to 50/s and the aspect ratio was adjusted by 2x downsampling in the horizontal direction. No further averaging was applied.
To illustrate how low signal strength can be compensated by averaging, an extremely dense sampling pattern was applied, with 1900 A-scans per B-frame and 9500 B-frames per volume acquired in 6.3 s. While this acquisition time is too long for clinical application, it enables the controlled investigation of averaging very large numbers of adjacent B-frames. Figure 8 shows the average of 24 and 48 B-frames taken around the location of the B-frame in Fig. 7(b). Note that in this case, no alignment was necessary for averaging. Two points are noteworthy: First, signal intensity is increased to a level where all fundus layers up to the choroidal-scleral interface can be clearly identified. Moreover, fine structures such as retinal capillaries can be readily observed. Second, the B-frame sampling density is so high that the speckle pattern is still partially visible despite the very large number of up to 48 averaged frames. This may not be surprising, since the distance between neighboring B-frames is only about 1.3 µm (using a scaling factor of 288 µm/degree, and 9500 B-frames over 42°). Hence, the 24 averaged frames span a distance of only slightly more than the spot size on the retina. Since the speckle pattern should remain fairly stable within a spot size, the speckle pattern is still visible even with 24x averaging. While speckle reduction is more pronounced with 48x averaging, the larger area covered already causes blurring in steep anatomical structures such as the optic nerve head.
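The sampling figures quoted above can be checked with a few lines of arithmetic. The sketch below uses only the numbers given in the text (288 µm/degree, 9500 B-frames over 42°, 1900 A-scans per B-frame, 3.35 MHz). The scanner duty cycle of 85% is an assumption made here to relate the raw A-scan count to the quoted 6.3 s; the text does not state a duty cycle for this measurement. The function names are hypothetical.

```python
def bframe_spacing_um(n_bframes, fov_deg, um_per_deg=288.0):
    """Distance between neighboring B-frames on the retina in micrometers."""
    return fov_deg * um_per_deg / n_bframes

def acquisition_time_s(ascans_per_bframe, n_bframes, line_rate_hz, duty_cycle=1.0):
    """Volume acquisition time; duty_cycle < 1 accounts for scanner fly-back."""
    return ascans_per_bframe * n_bframes / (line_rate_hz * duty_cycle)

spacing = bframe_spacing_um(n_bframes=9500, fov_deg=42.0)
span_24 = 24 * spacing   # approximate extent covered by 24 adjacent frames
span_48 = 48 * spacing   # approximate extent covered by 48 adjacent frames

# Assumed 85% duty cycle (not stated in the text for this data set).
t_volume = acquisition_time_s(1900, 9500, 3.35e6, duty_cycle=0.85)

print(f"B-frame spacing: {spacing:.2f} um")
print(f"24 frames span about {span_24:.0f} um, 48 frames about {span_48:.0f} um")
print(f"Acquisition time: {t_volume:.1f} s")
```

The spacing evaluates to roughly 1.27 µm, so 24 averaged frames cover about 31 µm and 48 frames about 61 µm in the slow scan direction. With the assumed duty cycle the acquisition time comes out at about 6.3 s, whereas an ideal 100% duty cycle would give about 5.4 s.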
Furthermore, additional anatomical features such as vitreoretinal adhesion become visible in the 48x averaged frame (see red circle in Fig. 8(b)). As already noted in [57], the optimum interframe distance for speckle reduction (see for instance [77]) may be too large to maintain sufficient transverse resolution. In summary, the low image quality of 3.35 MHz OCT can be improved considerably by averaging. Averaging 24 frames yields good image quality, especially considering that 24x averaging at 3.35 MHz still corresponds to an effective A-scan rate of 140 kHz.
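The trade-off between interframe distance and speckle reduction can be illustrated with a toy simulation. It is a sketch under strong assumptions and not a model of the actual data: speckle is taken as fully developed (circular Gaussian complex field, so the intensity of a single frame has a contrast of 1), and the similarity between frames is described by a single hypothetical parameter `field_correlation`. Uncorrelated frames (value 0) reduce the contrast as 1/sqrt(N), while identical speckle patterns (value 1) are not reduced at all.

```python
import numpy as np

def speckle_contrast(intensity):
    """Speckle contrast: standard deviation divided by mean intensity."""
    return intensity.std() / intensity.mean()

def averaged_contrast(n_frames, field_correlation, n_pixels=200_000, seed=0):
    """Contrast after averaging n_frames simulated speckle intensity frames.

    Each frame's complex field mixes a component common to all frames with an
    independent component; field_correlation (0..1) sets the common fraction."""
    rng = np.random.default_rng(seed)

    def random_field():
        real = rng.standard_normal(n_pixels)
        imag = rng.standard_normal(n_pixels)
        return (real + 1j * imag) / np.sqrt(2)

    common = random_field()
    frames = []
    for _ in range(n_frames):
        field = (np.sqrt(field_correlation) * common
                 + np.sqrt(1.0 - field_correlation) * random_field())
        frames.append(np.abs(field) ** 2)
    return speckle_contrast(np.mean(frames, axis=0))

for rho in (0.0, 0.5, 1.0):
    print(f"field correlation {rho:.1f}: contrast {averaged_contrast(24, rho):.2f}")
```

In this model, 24 fully decorrelated frames would bring the contrast down to about 0.2, whereas frames acquired within one spot size (correlation close to 1) leave most of the speckle pattern in place, which is consistent with the observation that speckle remains visible after 24x averaging of very closely spaced B-frames.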

6.7 MHz dual-spot imaging: FDML without dispersion compensation
As a proof-of-principle experiment, we performed dual-spot imaging at 2x 3.35 MHz axial line rate with an FDML laser without dispersion compensation. The lower laser performance compared to the dispersion-compensated laser highlights the importance of a well-designed laser for swept-source multi-MHz imaging. The reduced axial resolution and faster sensitivity roll-off are clearly visible in the resulting OCT images (Fig. 9). The lower image quality is also due to the fact that we used only 0.85 mW sample arm optical power for each of the two beams, i.e. the same combined power as for single-beam imaging. Nevertheless, moderate 6x averaging increases the image quality considerably. The effective axial line rate with 6x averaging is 1.1 MHz, still higher than the raw speed demonstrated with any non-FDML light source. The en-face image also shows ciliary shadowing, a typical problem encountered in wide-field retinal imaging [78].
Fig. 9. Two-beam imaging at an effective axial line rate of 6.7 MHz, using only 0.85 mW optical power for each beam. a) Reconstructed en-face view: White dotted outline shows the area scanned by beam 1. There is around 10% overlap between the areas scanned by the two beams. b,c) B-frames from beams 1 and 2, as indicated by the red bars in the en-face view. The frames were 2x decimated in the transverse direction. d) 6x B-frame average at the same position as the frame shown in c), with enlarged region e).

Summary and outlook
We presented in-vivo OCT of the posterior pole at axial line rates of multiple MHz for the first time. At these ultrahigh speeds, image quality is affected by several challenges, such as limited sensitivity and imaging range. We have demonstrated three ways to overcome some of these issues: First, even moderate MHz speeds are sufficient for successive acquisition of many data sets. With the resulting large amount of acquired data, high image quality can be achieved. Here, 24 data sets have been acquired in less than a second, with low probability of distortion in each single data set. Hence, averaging reduces distortions and image artifacts, which are spread out statistically over all data sets. The presented alignment and averaging scheme corresponds to a software-based eye-tracking solution. System complexity is therefore considerably reduced compared to commonly used hardware (SLO) based tracking schemes, since no further hardware is required. Second, we have shown that a further increase to multi-MHz speeds is feasible even with relatively low optical power levels. Sufficient image quality can be obtained simply by averaging a correspondingly larger number of frames. It might be argued that higher speeds make no sense if the necessary number of averaged frames increases accordingly, lowering the effective line rate again. However, frame averaging will usually be performed anyway for a different reason, namely speckle reduction. We believe that the perceived image quality of a fast system with 10x frame averaging will be higher than the image quality of a 10x slower system without averaging, assuming that the initial sensitivity is not too low. Suitable sensitivity values larger than ~90 dB should be easily achievable in ophthalmic multi-MHz OCT, by using higher MPE values for scanned systems. At the same time, even with averaging, the effective line rates in multi-MHz OCT exceed the line rates of commercial systems. For instance, 24x averaging at 3.35 MHz, as demonstrated in this paper, still yields an effective line rate of 140 kHz. Hence, very fast OCT systems are able to perform frame averaging for speckle reduction without reduction of 3D capabilities. Third, multi-beam imaging may be a solution for a further increase in speed, especially if the total sample arm power is increased. With faster resonant beam scanners, the demonstrated speed of 6.7 MHz is sufficient for video-rate 3D OCT with reasonably dense sampling. For instance, a volume of 477 x 477 A-scans could be acquired with a volume rate of 25 Hz, using only 85% scanner duty cycle. Real-time video-rate ophthalmic 3D OCT may be useful in applications such as OCT for surgical guidance, which until now has been restricted to 2D live display or small 3D sampling densities only [79,80].
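The two rate estimates in this summary follow from simple relations: the effective line rate is the raw line rate divided by the number of averaged frames, and the volume rate is the usable line rate (raw rate times duty cycle) divided by the number of A-scans per volume. The short sketch below reproduces both numbers; the function names are hypothetical and the calculation assumes that no time is lost other than through the stated duty cycle.

```python
def effective_line_rate_hz(raw_line_rate_hz, n_averaged_frames):
    """Line rate that remains after averaging n frames into one."""
    return raw_line_rate_hz / n_averaged_frames

def volume_rate_hz(raw_line_rate_hz, ascans_per_bframe, n_bframes, duty_cycle=1.0):
    """Volumes per second for a raster scan with the given scanner duty cycle."""
    usable_rate = raw_line_rate_hz * duty_cycle
    return usable_rate / (ascans_per_bframe * n_bframes)

# 24x averaging at 3.35 MHz, as quoted in the text
rate_avg = effective_line_rate_hz(3.35e6, 24)

# 477 x 477 A-scans at 6.7 MHz with 85% duty cycle, as quoted in the text
rate_vol = volume_rate_hz(6.7e6, 477, 477, duty_cycle=0.85)

print(f"Effective line rate: {rate_avg / 1e3:.0f} kHz")
print(f"Volume rate: {rate_vol:.1f} Hz")
```

Both values agree with the figures given above: about 140 kHz effective line rate and about 25 volumes per second.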
Our results also show that the imaging range of ~3-4 mm (in air) is sufficient for ultrawide-field fundus coverage in normal subjects, in order to deal with the image curvature of the retina over the wide OCT viewing angle. Here, our imaging range was limited by the electronic bandwidth and sampling rate of the detection system. The next generation of ultrafast A/D cards will likely allow for sustained data acquisition at sufficient electronic bandwidth and more than the demonstrated 1.5 GS/s, thus opening the way for ultrawide-field fundus coverage and multi-volume averaging at multi-MHz speeds.