Improving lateral resolution and image quality of optical coherence tomography by the multiframe superresolution technique for 3D tissue imaging

: The multi-frame superresolution technique is introduced to significantly improve the lateral resolution and image quality of spectral domain optical coherence tomography (SD-OCT). Using several sets of low resolution C-scan 3D images with lateral sub-spot-spacing shifts on different sets, the multi-frame superresolution processing of these sets at each depth layer reconstructs a higher resolution and quality lateral image. Layer by layer processing yields an overall high lateral resolution and quality 3D image. In theory, the superresolution processing including deconvolution can solve the diffraction limit, lateral scan density and background noise problems together. In experiment, the improved lateral resolution by ~3 times reaching 7.81 µm and 2.19 µm using sample arm optics of 0.015 and 0.05 numerical aperture respectively as well as doubling the image quality has been confirmed by imaging a known resolution test target. Improved lateral resolution on in vitro skin C-scan images has been demonstrated. For in vivo 3D SD-OCT imaging of human skin, fingerprint and retina layer, we used the multi-modal volume registration method to effectively estimate the lateral image shifts among different C-scans due to random minor unintended live body motion. Further processing of these images generated high lateral resolution 3D images as well as high quality B-scan images of these in vivo tissues.

Spectral domain OCT (SD-OCT) imaging has two distinct resolutions namely axial resolution in the image depth direction and lateral resolution in the transverse image plane. The axial resolution can be improved by reducing the coherence length of the measurement light beam using supercontinuum broad bandwidth light source [19] or extended broadband superluminescent diode (SLD) [20]. The lateral resolution is restricted by optical diffraction limit [21], lateral sampling rate [22] and background noise [23]. The diffraction limit is determined by the system numerical aperture ( NA ). Although using a high NA optics with smaller focused beam spot size on the sample can improve the SD-OCT lateral image resolution, there is an obvious trade-off of subsurface imaging depth in the tissue sample owing to the depth of focus ( DOF ) limitation, losing its main advantage over confocal microscope. Also high NA system leads to a smaller lateral field of view (FOV). Therefore, improving the lateral resolution without sacrificing DOF and FOV is a very important research objective.
To improve the lateral resolution of SD-OCT, adaptive optics (AO) correcting aberration wave front has been adopted by researchers [24]. With high cost and very limited FOV (maximum 1 × 1 mm 2 ) [25,26], AO technique in principle is to recover the true resolution of OCT blurred by human eyes. Thus, this technique is often applied in ophthalmic research, not suitable for other applications like skins due to scattering blurring. A virtually structured detection (VSD) method [27] was reported that added an electro-optic phase modulator in the reference arm to shift the light phase with multiple of / 2 π , and then fused four phase shifted A-scans by VSD algorithm to achieve resolution doubling. It is however time consuming (taking ~40 s for each image frame) and is infeasible for in vivo imaging. A digital refocusing approach [28] was introduced to improve the lateral resolution of out-of-focus plane which may not improve the lateral resolution on the focus plane. M. Dirk Robinson et al. [29] registered four sparse scanned summed voxel projection (SVP) of retina images and then rearranged them and reconstructed a higher density SVP image in y -axis to improve the resolution and reduce motion errors. The fast processing limited the resolution improvement of the final SVP image in only one coordinate axis and also the quality improvement compared with the traditional high density scan method. Digital image deconvolution processing has recently been widely applied for resolution improvement [30][31][32][33]. The estimation of the ground true lateral point spread function (PSF) of the system is however very difficult and the actual PSF may be different in different tissue samples and at different sample depth layers.
Background noise, including white and speckle noises, is another important factor which disturbs and hides structural details, degrading the resolution and quality of OCT images. White noise attributed to random signal noise can be effectively suppressed by multi-frame averaging [34,35]. The more the image frames, the lower the white noise floor. Differently, speckle noise is a kind of pattern noise corresponding to the sample structure. Simple multiframe averaging cannot completely remove such structural signal related noise. Maciej Szkulmowski et al. [23] introduced an interesting shifted frames averaging method to suppress the speckle noise in B-scan. Through averaging multiple B-scans with shifts in xand y-directions, the final output B-scan image exhibits speckle reduction and overall image quality improvement. However, this approach appears difficult to avoid ghost patterns from different frames such as multiple fingertip patterns appearing in the final output image. Also, simple averaging shifted images without shift compensation may penalize the high frequency signals, degrading the resolution. Besides, longer scan time is impractical for in vivo 3D imaging.
Lateral sampling rate [22] in scan based OCT imaging is termed the scan matrix density. The higher the scan matrix density (within a certain density range), the better the lateral resolution performance. Increasing the scan matrix density is at the expense of longer scan time and is thus not suitable for time sensitive scan imaging applications, such as in vivo imaging of eye or skin. Besides, high scan density cannot overcome the diffraction limited optical resolution restriction and reduce noises.
In this paper, we introduce an effective multi-frame superresolution technique to significantly improve the lateral resolution and image quality of SD-OCT with less scan time and without adoption of extra hardware and higher NA optics. Through adjustment of x and y galvanometer scanners to introduce sub-spot spacing shifts among different low-resolution Cscans, superresolution processing is then applied to generate a higher lateral resolution 3D in vitro image with lower background noise. Image registration method is used to estimate the unknown random in vivo sample shifts among different C-scans. The subsequent superresolution processing demonstrates improved lateral resolutions as well as image quality of 3D in vivo images of skins, fingerprints and retina layers.

Design and principle
2.1 SD-OCT and lateral resolution limit SD-OCT system (BIOptoscan OS-186, New Span Opto-Technology), as schematically shown in Fig. 1, sends a broadband light from the SLD to a 2 × 2 optical fiber coupler. The SLD has a center wavelength λ o of 860 nm and spectral bandwidth δλ of 100 nm (IPSDW0822-0314, InPhenix). One split beam is sent to the reference arm that is collimated and focused to a mirror before it is reflected back to the fiber coupler. The other split beam in the sample arm is collimated and is laterally scanned by a pair of galvanometer scanner mirrors and focused to the measurement sample. The scattering signals from different depth layers of the sample collected by lens and scanners are sent back to the fiber coupler to interfere with the reference arm return beam and generate spectral dependent interference signals that are separated by the grating and acquired by the CCD line scan camera of the optical spectrometer for computer signal processing. Each scattering depth would result in a near sinusoidal interference pattern in the frequency domain. A complex signal pattern is obtained due to mixing of scattering signals from all sample depth layers. A fast Fourier transform processing of the complex interference pattern in the frequency domain can beautifully retrieve scattering information on all sample depth layers within the depth range of the SD-OCT set by the combination of sample arm optics DOF and the interference pattern distinction ability of the optical spectrometer. This yields the A-scan depth intensity profile ( ) I z . Through galvanometer scanning in the transverse x axis, we obtain the B-scan xz intensity image ( , ) I x z . By galvanometer scanning in the transverse x and y axes line by line, we have the C-scan xyz intensity 3D image ( , , ) I x y z .
As mentioned earlier, the focused beam spot size at full width at half maximum (FWHM) of the SD-OCT is greatly limited by the NA of the sample arm optics [21] as where NA is the ratio of the input collimated beam radius to the focal length of the sample arm lens in the air. The axial DOF is determined by [21] ( ) Given our collimated beam diameter of ~3 mm, the corresponding focusing spot size and axial depth range are listed in Table 1 for a few common lens focal lengths. Here, before reaching spectrometer limitation, the axial depth range is set by the axial DOF . Clearly, improving the lateral resolution with smaller focused beam spot size would be at the expense of reducing the DOF , or overall SD-OCT imaging axial depth range. Although using a 10 mm focal length lens would provide a best beam spot size of 2.12 µm, its poorer axial depth range of 0.086 mm would make the SD-OCT incompetent to the confocal microscopy which maximally has a depth range of about 0.2 mm [3]. Using a 100 mm focal length lens, the calculated focused beam spot size is 21.21 µm which closely matches with the measured beam spot size using a laser beam profiler (BP-5.0, New Span Opto-Technology). Although this focused beam can offer the long depth range of 2.86 mm set by the spectrometer, its relatively large focused spot size does not provide desirable lateral resolution. Therefore, without hardware improvement to overcome diffraction limitation, the lateral resolution improvement of OCT should consider image processing methods. If we can improve the lateral resolution to several µm with very good image quality and maintain the depth range of 2.86 mm, the processing based lateral resolution improvement technique should benefit various biomedical tissue imaging applications. Using a 30 mm focal length lens, our goal is to improve the lateral resolution to ~2 µm to approach that of confocal microscope which could benefit fine tissue structural imaging.

Improving lateral resolution by high density scanning
In order to scan a large area, the focused beam spot scan matrix, or C-scan, is usually set to scan one by one without positional overlapping, as illustrated in Fig. 2(b), similar to spatially separated camera pixels. The presence of spot spacing results in loss of spatial image features, or called undersampling. Without demanding smaller focused beam spot size for preserving a long DOF, a high density C-scan with partial scan beam spot overlapping could improve the SD-OCT lateral resolution to some extends, but at the expense of reducing the FOV, as illustrated in Fig. 2(c). In principle, we could overcome the FOV lost by high density scan imaging of several nearby areas and then applying the image stitching technique [36] to reconstruct a large areal image. The required motorized stage for stable sample arm moving and 30~50% area overlapping for images registration and fusion would make the image stitching processing difficult and time consuming which is unsuitable for in vivo imaging of live tissues with unknown tissue motion and vibration. Besides the FOV reduction, this highdensity scanning method has its resolution limitation to be discussed later in the experiment Sec. 3.1. In this paper, if without specific explanation, the low density scanning means scan spots are one by one close space spot array without overlapping. The high density scanning means adjacent scan spots are partially overlapped. For example, 4 times high density scanning means each scan spot have half spot overlapping in each x and y direction with four neighboring ones (top, bottom, left and right). Fig. 2. Example images of a high resolution target (a) and under a low density scan imaging (b). (c) 4-time high density scan imaging with the same scan matrix size as (b) but with half spot overlapping with adjacent ones yields high resolution image with a reduced FOV. Here, we apply four colors to show overlapping of adjacent scan spots.
In order to avoid FOV reduction, a larger C-scan matrix may be applied to the sample sacrificing the scan time, which is unacceptable for time sensitive in vivo imaging of live tissues due to tissue motion and vibration. To demonstrate the time dependent live tissue motion, we performed 100 repeated sets of 128-spot B-scans without scan spot overlapping to image a human skin. The 100 sets of such B-scan completed within 0.55 s and the fast Fourier transform OCT image processing was performed after the 100 B-scan acquisitions. The comparison of the 1st and the 100th B-scan images shows no observable lateral image shifting as shown in Fig. 3(a), demonstrating that a 100 × 128 C-scan is fast enough (we have further verified that 128 × 128 C-scan is also fast enough) and is suitable for typical live tissue in vivo SD-OCT imaging without the concern of skin tissue motion in one C-scan. On the other hand, we performed 100 repeated sets of 512-spot B-scans with the same FOV as above at the same line scan rate of 24 kHz and compared the 1st and the 100th images as shown in Fig. 3(b). Again, the fast Fourier transform processing was performed after the 100 B-scans acquisition. We observed obvious image positional shifts during the 100 × 512 scan imaging period indicating some tissue motion during this period. These two tests were repeated several times with similar results. We note that the scanner optics was held steady during the course of the scan image acquisition. This indicates that 100 × 512 scan density, not so high density, is already inadequate for reliable in vivo imaging of live skin tissues owing to the concern of live body motion and vibration. Needless to say, in vivo tissue image misalignments are expected in 512 × 512 or 1024 × 1024 C-scans due to longer C-scan time. For retina C-scan imaging, unintended eye quick motions are harder to avoid than skin motions. The eye motion can result in image artifacts and misalignments as indicated at the yellow arrow line in Fig. 3(c). Several motion correction methods may be used to partially solve this problem [37]. However, the best way for in vivo sample imaging is to scan as fast as possible, avoiding any image artifacts and errors in a single 3D image C-scan set. In our experiment, a 128 × 128 C-scan with 0.7 second acquisition time can effectively prevent obvious motion errors, guaranteeing the data reliability in one C-scan. For unavoidable image shifts among different C-scan sets, an image registration method will be used. Fig. 3. We performed in vivo SD-OCT B-scan on a live human skin with 100 repeated times at (a) 128-spot density and (b) 512-spot density. The FOV is the same. We then overlapped the 1st and the 100th images with green and magenta colors to compare the differences between them. (a) Due to the fast 128-spot B-scan, the horizontal positions of the hair (in the yellow and green squares of (a)) in the 1st and the 100th images remain almost unchanged. (b) There are obvious differences between the two images due to slower 512-spot B-scan that is found more sensitive to tissue motion. (c) The yellow arrow indicates motion artifact line caused by eye motion that lead to discontinuities of the two micro-vessels. 100 mm focal length lens was used in these experiments. The scale bars in (a), (b), and (c) are 500 µm.

Improving lateral resolution by multi-frame superresolution for in vitro imaging
Multi-frame superresolution is a kind of image processing technique which studies image degradation models (such as optical blur effect, motion effect, down-sampling effect, and additive noise) and then recover the high resolution images from multiple low quality images based on the superresolution model used, overcoming the resolution limit of the hardware. Fig. 4 illustrates how these effects together result in a lower quality image during conventional camera image acquisition. To recover the high resolution image, we can reverse the image degradation process using techniques such as up-sampling, motion/pixel shift estimation and compensation, deconvolution, and denoising. In a SD-OCT system, the main degradations of an ideal lateral image ( , ) S x y extracted from a depth layer of a C-scan 3D image are from the optical blurring ( , ) H x y caused by the response of the OCT imaging system (such as lateral PSF) and the down-sampling ↓ due to the relatively large focused beam spot size on the sample and low scan matrix density. Each isolated focused beam spot is treated as a pixel in conventional camera imaging. The motion effect can be generally ignored when imaging in vitro samples [11,16] since both the sample arm and the tissue are held steady. The motion effect should be considered for imaging of in vivo tissue due to potential live body motions and vibrations. According to the superresolution principle [38][39][40], the resolution improvement comes from the effective sub-pixel differences among multi-frame low resolution images, as illustrated in Fig. 5. Without introducing sub-pixel shifts among images, the stationary multi-frame imaging and processing would mainly contribute to the minimization of time dependent white noises [34]. For SD-OCT superresolution, the conventional sub-pixel shift now called sub-spot-spacing shift is needed.
Applying multi-frame superresolution technique to SD-OCT imaging, Fig. 5(a) illustrates schematically a 5 × 5 pixel image in one depth layer of a C-scan 3D image with no scan beam spot overlapping. Here, each pixel is a scan beam spot position on the sample. In order to satisfy multi-frame superresolution processing requirement, we intentionally introduce slight differences in scanner control voltage matrix to create several sets of C-scans with sub-spotspacing shifts (equivalent to sub-pixel shifts) in the xy plane, as schematically shown in Fig. 5(b), 5(c), and 5(d). The superresolution processing of the four images in the same depth layer but from different C-scans generates a higher resolution larger pixel count image of that depth layer as schematically shown in Fig. 5(e).
Here we note that the image resolution covers two points: one is pixel resolution which is equivalent to dots per inch in conventional terminology and the other is spatial resolution which is defined as the smallest discernible detail in an image [41]. Figure 5(e) is obviously a result of lateral pixel resolution improvement leading to lateral spatial resolution improvement. Although simply reducing the sub-spot-spacing shifts and increasing nonidentical image frames for multi-frame superresolution processing can significantly improve the lateral pixel resolution, there is however a spatial resolution limit owing to optical diffraction limit, system noises, stability of interference pattern and so on. In other words, when the scan matrix density is high enough, further increment would have no obvious help to lateral resolution improvement. Thus, finding the relationship among lateral resolution improvement, sub-spot-spacing shifts, and number of image frames would help identify a suitable resolution improvement technique without demanding an excessive number of shifted image frames, which in term would avoid the acquisition of unnecessary image frames and associated excess acquisition time. As illustrated in Fig. 5, these C-scans have multi-directional sub-spot-spacing shifts. For easier explanation, the shifts in Fig. 6(b) is a simplified illustration of the shifts in Fig. 5(a)-(d), using four blocks to show the shifts directions and shift amount relative to the first nonshift C-scan labeled as 0. Here, 0.5 spot means shifting 0.5 spot spacing in the corresponding direction. Strictly speaking, the three shifts should be represented as (0.5, 0), (0, 0.5), and (0.5, 0.5) in the x and y coordinates for superresolution processing in mathematics. Compared with the traditional four-frame shift strategy in Fig. 6(b), we found that more shifts (gray shifts) in Fig. 6(c) and 6(d) in addition to red shifts can lead to much better image quality and higher spatial resolution as experimentally verified. The introduction of these additional shifts can effectively provide more information for superresolution processing and suppress the background noises of the OCT system. By using 1/4-spot-spacing shift as in Fig. 6(d) red points, the superresolution technique can improve the lateral pixel resolution by about 16 times in principle. Using 1/8-spot-spacing shift C-scans (not shown) it can improve the lateral pixel resolution by about 64 times. In order to simplify the explanation of the shift strategy later in the paper, we call the Fig. 6(c) as 1/2-spot-spacing shift step and maximum 1/2-spotshift, and Fig. 6(d) as 1/4-spot-spacing shift step and maximum 3/4-spot-shift, and so on. The superresolution processing of multiple low resolution image frames with sub-spotspacing shifts can help recover the ideal high quality lateral image ( , ) S x y . Considering a pure translational motion model with space invariant blur and additional noise as [ , ] V x y , one of the acquired low resolution lateral image [ , ] I x y at a selected depth layer is mathematically modelled as Here, F is the motion operator due to sub-spot-spacing shifts discussed above. ( ) , H x y is the PSF of the system blurring the image. ⊗ is the convolution operator. ↓ is the discretizing down-sampling operator.
We apply a blurring-warping model for OCT image superresolution processing which may be different from warping-blurring model [39,42] commonly used for camera superresolution applications. For SD-OCT scan imaging, a C-scan 3D image includes multiple lateral images at different sample depth layers. These lateral images of interest are obtained through many A-scans where optical blurring occurs first followed by potential spatial shifts in C-scans (purposely introduced sub-spot-spacing shifts or unintended live tissue motion). The fast A-scan takes 36 µs or shorter exposure time and thus the motion blur can be ignored for both the in vitro and in vivo imaging. By excluding the motion blur, the OCT image blur is now mainly due to finite focal spot size (lateral PSF). We have experimentally verified that for a 6 mm diameter lateral scan area the focal spot size variation is < 6% for a 100 mm focal length lens (AC254-100-B, Thorlabs). The uniform lateral PSF can thus support the space invariant image blur model. For experiments discussed later in this paper, we have kept the lateral scan area within this 6 mm diameter or smaller. The slight spatial shifts among different C-scans will not affect the space invariant image blur model.
For in vitro C-scan 3D imaging, the stable sample and scanner holders can ensure no motion influence. For in vivo live tissue C-scan 3D imaging, the unintended tissue motion and vibration would introduce motion artifacts, but not motion blur. Our sparse C-scan takes only 0.7 s for a 128 × 128 scan matrix, which effectively reduces the possibility of motion artifacts in the 3D image. Moreover, we limit the warping to translational shifts. For in vitro imaging, we introduce sub-spot-spacing translational shifts. For in vivo imaging, the unintended tissue motion can be decomposed into pure translation shifts in x -, y -and z -axis for most OCT tissue imaging applications. Therefore, for multiple OCT C-scans image processing, it is reasonable to use the blurring-warping model (image blur first warping second). Limited to translational shifts with space invariant blurring, our blurring-warping model could benefit the superresolution processing [39,40,[42][43][44].
Using the image degradation model in Eq.
(3), we can recover the high resolution image ( , ) S x y in mathematical processing. Generally speaking, the recovering processing can be treated as minimizing the difference between the model and the measurement values. We estimate the approximate high resolution image Ŝ in a minimum p L norm problem [39] as 1 ArgMin .
Here, k I is the kth original low resolution image frame. k F is the motion operator for the kth low resolution image. D is the down-sample operator which can be simply determined as 1 8 , 1 4 or 1 2 by how many resolution improvements we expect (such as 8, 4 or 2 times) and the number of k I we have. H is the optical blur operator. In Eq. (4), we do not consider the noise [ , ] V x y because it is an additive term and can be suppressed by multi-frame superresolution processing. Due to the complexity of PSF in the SD-OCT system, we define G HS = as the image S convolved with a PSF. This can be used to solve the deconvolution problem [30].
Now, the minimization of p L norm problem can be separated into two steps: find a nondeconvolved high resolution image Ĝ from a series of low resolution image frames and then find a proper PSF to eliminate optical blur H and recover the expected image Ŝ from Ĝ .
There are reports on deconvolution to improve OCT image resolution [31][32][33]. Lucy-Richardson deconvolution [33,45,46] with a proper Gaussian PSF appears to be a widely accepted solution for recovering blurred images, where ˆ( , ) m S x y is the estimate of the undistorted image in m th interation. The optimization process starts with ( ) H x y is the lateral PSF of the system. Usually the Gaussian PSF is a common selection [31][32][33] owing to the focused beam profile following a certain Gaussian distribution. While in practice, the beam profile may not have circular symmetry for off-axis scanning. With scattering inside a sample, the focused beam may not retain near Gaussian distribution. Thus blind deconvolution [47,48] is an alternative solution, which uses maximum a posteriori probability (MAP) algorithm to automatically estimate the irregular PSF to deblur an image, avoiding the limitation of the regular PSF and exhibiting less artifacts in the final image. In this paper, we applied the blind deconvolution method introduced by D. Krishnan et al. [48]. In theory, the Airy disk [49], or diffraction limit of focusing, is related to the PSF of the optical system and thus it is possible to break the diffraction limit through deconvolution with a correct PSF. Although the Ŝ deconvolved from a Gaussian PSF would show obvious resolution improvement to Ĝ , this method may lead to ringing artifacts and output image quality reduction. Applying blind deconvolution with an estimated irregular PSF could perform better for practical applications. It is important to note that the deconvolution methods are usually sensitive to the noises which would limit their applications, to be explained later in the experiment Sec. 3.1. In this paper, we thus focus on the first step to reconstruct a high quality image Ĝ , but also provide deconvolved images for readers to compare.
If 1 p = , it is a 1 L norm problem, or a least-absolute problem. If 2 p = , it is a 2 L norm problem, or a least-square problem. 1 L norm is robust to outliers but with unstable multiple solutions and may penalize the high frequency signals. In most OCT applications presented in this paper, such as in vitro and in vivo imaging of tissues, we notice that the background are usually zero mean white noises along with structure related speckle noises without obvious outliers. Both the white noise and the speckle noise can be suppressed by processing with adjacent pixels [23] and multiple lateral images. Therefore, we applied a kind of 2 L norm normalized convolution algorithms introduced by H. Knutsson et al. [50] and T. Q. Pham et al. [43] to our SD-OCT system. The reason of selecting the normalized convolution algorithm [43,50] instead of other steepest descent algorithms is that it uses polynomial basis functions and local Taylor series expansion to consider the relation of a center pixel with neighborhood encompassing N pixels (for example, radius of 4 pixels). In other words, the final output value of each pixel is optimally solved [51] by adjacent ones, which is an effective method to reduce the speckle noise. In experiment section, through shifted C-scans and normalized convolution algorithm, we demonstrated that our superresolution technique can significantly reduce the background noise in final lateral and 3D images. Moreover, this algorithm allows us to design different weights on different adjacent pixels according to their distances to the center and only consider relative small reference radius, avoiding ghost patterns observed in output B-scan image. For motion sensitive applications, such as in vivo imaging of retina layers, sometimes we observed some outliers due to the motion artifacts caused by eye moving in the scanning procedure. Through SVP of multiple layers, the outliers could be effectively reduced. In the experiment Fig. 17, we will show the robustness of our method to suppress the outliers.

Estimating the unknown shifts to improve lateral resolution by multi-frame superresolution for in vivo imaging
The above superresolution processing is suitable for SD-OCT imaging of in vitro samples where the sub-spot-spacing shifts k F are known since the sample arm and the measurement sample are kept steady during the scan image acquisition. For in vivo SD-OCT imaging of live tissues, the shifts k F are unknown due to live body motion and vibration, which makes the superresolution processing complicated. We need an effective estimator to calculate the random shifts among different C-scans of the in vivo tissue. For these C-scan 3D images, the spatial shifts are introduced by body motion and we do not introduce sub-spot-spacing shifts through scan matrixes. In order to analyze the unknown shifts, we decompose the spatial shifts into two directions: one is in the depth z -axis and the other is in the lateral xy plane. Herein, we examine the tissues where the rotational motions can be ignored.
In the z -axis, the shift between any two C-scans can be corrected by comparing their 3D image top positions. While in the lateral xy plane, due to the complex intensity distribution, we need to find a better shift estimator. In order to more accurately estimate the shifts in the xy plane, firstly we make a SVP along the z -axis to enhance the contrast of key structures in the xy plane. Then we apply the popular image registration algorithmmulti-modal volume registration [52] to estimate the shifts between any SVP image and the reference one. According to the theory, we seek to maximize the mutual information between the reference image u and test image v: .
Here T is a transformation from the reference image to the test image.
We treat x as a random variable over coordinate locations in u and v. We seek the best transformation T using algorithms [52][53][54] to maximize the mutual information I between u and v . As the superresolution model we introduced above, the T here is a pure translation warping matrix. This T is considered the best motion operator k F in Eq. (4) for the kth low resolution in vivo image to the reference low resolution image. Figure 7 exhibits the effect of this algorithm, which perfectly estimates the shifts between images in (a) and (b) and thus the overlapping image with shift correction in (d) is clear. Using the effective motion estimator algorithm, we solve the problem on how to determine unknown lateral shifts between in vivo images. After obtaining the shifts k F , the superresolution processing as described in Sec. 2.3 can be applied for lateral resolution improvement. The multi-modal volume registration method offering quality shift estimation with fast computation speed is used in this paper.

SD-OCT image acquisition and superresolution processing
The SD-OCT image acquisition is a lateral spot scanning image acquisition process where the depth tomographic information is obtained by fast Fourier transform processing. Our superresolution processing is to analyze and improve the lateral resolution, which is in the xy image plane. First, we perform SD-OCT C-scan, acquiring a set of B-scan images ( , ) I x z in the x direction at different y (see Fig. 8(a) left). These multiple B-scans can be arrayed in sequence to generate a 3D matrix ( , , ) I x y z as shown in Fig. 8(a) middle. We then retrieve a set of xy images ( , ) I x y at different depth z as shown in Fig. 8(a) right. The lateral resolution improvement is to use several slightly position shifted xy images at an identical z position (see Fig. 8(b) middle) but from different C-scans (A, B, C, etc. from Fig. 8(b) left) to perform superresolution processing and yield a high lateral resolution image ˆ( , ) S x y as in Fig. 8(b) right. Repeat process in (b) layer by layer for all depth z values can yield a high lateral resolution 3D image in the whole scan space, not shown. Here, we emphasize two important points mentioned earlier: for in vitro sample imaging discussed in Sec. 2.3, our superresolution processing with overall less scan time can improve the lateral resolution and image quality much better than high density scanning and multiframe averaging techniques; for in vivo sample imaging as discussed in Sec. 2.4, a proper image registering technique can accurately estimate the unknown shifts among these C-scans followed by superresolution processing.

Lateral resolution, image quality and efficiency improvement
In order to compare our superresolution technique with designated shifts to high density or multi-frame averaging methods, we evaluate them in three aspects: lateral spatial resolution, image quality, and scan time.
1) Lateral spatial resolution. It is the ability to distinguish two closely spaced small objects, an important indicator of OCT systems. We apply a negative resolution targets (R3L3S1N -Negative 1951 USAF Test Target, Thorlabs), as shown in Fig. 9, to verify the improvement of the lateral resolution. This resolution target can provide 10 groups (−2 to + 7) with 6 elements per group, offering a maximum resolution of 2.19 µm. Considering the beam spot size, group 4~5 and 6~7 are suitable for resolution testing of our OCT system with 100 mm and 30 mm focal length lenses, respectively. The resolution (width of 1 line) of group 4~7 is listed in Table 2.  The resolution target pattern is a negative clear tone glass pattern, with chrome areas appear dark under back light illumination while transparent pattern areas are white. The reason for selecting these resolution target patterns is that the SD-OCT system is more sensitive to reflectivity enhancement than reduction. A resolution target with reflection reduction is better for judging the resolution ability of the SD-OCT system. Using the clear tone resolution target, the SD-OCT is to find the reduced reflectivity in the line patterns. Successfully imaging these patterns can demonstrate both the high lateral resolution and high sensitivity of our technique.
2) Image quality. We take peak signal-to-noise ratio (PSNR) and dynamic range (DR) to evaluate the image quality improvement. A PSNR definition is given as [23,55]: 10 20 log .
Here, noise RMS is root mean square of dark noise. Higher DR means higher image quality and we can distinguish more details in both dark and bright areas of an image. For an OCT system, we expect to discover more information of deeper layers in a higher DR 3D image.
3) Scan time. In order to compare the scan time of our superresolution technique with high density scan, we take the scan time of 64 × 64 matrix as unit 1 (~0.18 s) for reference. Higher density 128 × 128 scans takes scan time of 4 units. Superresolution with 9 shifted low density scans of 64 × 64 takes scan time of 9 units. In experiment, the scan data are stored in buffer to ensure the shortest scan time and the fast Fourier transform processing is subsequently performed. Shorter scan time is very important for in vivo 3D imaging and quantitative analysis of live tissues due to concern of tissue motion and vibration which introduce errors and artifacts [37]. Even for 3D imaging of in vitro samples, a short scan time would still be needed to minimize effect of tissue degradation and drying after being exposed in air. Using a 100 mm focal length lens with a NA of 0.015 we performed SD-OCT imaging of the resolution target shown in Fig. 9(a). In Fig. 10 and 11, we compare OCT images acquired at different scan matrixes and imaging methods at the same FOV area of ~1 × 1 mm 2 . All OCT test images were taken in the same experiment with the same focusing condition and light source power. The errors of light intensities among different C-scans were within 1% and can be ignored. The displayed images have 8-bit gray range. The scan matrix and imaging method is given in the first column and the corresponding scan time in the second column. Column 3 is the OCT lateral image. Column 4 shows the enlarged image of the blue area of the column 3 image with the first row being barely distinguishable and the last row indistinguishable. Here, we define Group i Element j on the resolution target as GiEj for easier reading. The background noise of the red area with noise statistics (STD, PSNR, RMS and DR values) is detailed in column 5. The comparisons on lateral resolution, image quality, and scan time in Fig. 10 are summarized below.

(M) The inverted -axis
x intensity profiles at the yellow line of patterns G5E6 of (G) and (H) showing resolution improvement in (H). By visual comparison, the image resolutions in (I) and (J) are obviously better than that of (H). *The statistical values are for reference only since so few equivalent pixels in ROIs of (A) to (C) cannot be directly compared to that of (D) to (L). SR in (H) is short for superresolution. SR w LR de in (I) is short for superresolution with Lucy-Richardson deconvolution. SR w blind de in (J) and (L) is short for superresolution with blind deconvolution. 1) Lateral spatial resolution. In Fig. 10(A), we show the reference low resolution image performed with 64 × 64 scan matrix without beam spot overlapping. Due to the low scan matrix density, we barely see the resolution pattern in G4E3 and the lateral spatial resolution is about 25 µm. When increasing the scan matrix to 128 × 128 (B), 256 × 256 (C), 512 × 512 (D), 1024 × 1024 (E), and 2048 × 2048 (F) at the same fixed FOV, we observe the lateral resolution improvement. This trend indicates that increasing scan matrix density in general can contribute to the lateral resolution improvement. However, when increasing the scan matrix density from 1024 × 1024 to 2048 × 2048, we do not observe significant improvement. Further increasing the scan matrix density will significantly prolong the scan time while not contributing to lateral resolution improvement. From this trend and Table. 2, we see that in (E) the barely visible G5E4 line width is 11.05 µm which is close to our experimental focused beam spot radius of 10.5 µm.
As mentioned earlier, the lateral resolution is limited by scan density, background noise, and diffraction limit. Further increase in lateral resolution should consider other limitations such as background noise. We applied the popular multi-frame averaging approach to reduce the background noise. With five-frame averaging of 1024 × 1024 scanned images, we achieved improved image as in (G) showing visibility of G5E5 of 9.84 µm line width but cannot see G5E6 as indicated in (M) left. Averaging of more frames such as ten frames are found helpful to noise reduction but cannot improve the lateral solution to the next level of visibility to G5E6 (not shown here). The ten-frame averaging takes too much time and is not acceptable in a high density C-scan imaging.
Compared with high scan density and multi-frame averaging methods, our superresolution of low scan matrix density C-scans with designated shifts can demonstrate lateral resolution improvement advantages. Figure 10(H) shows our superresolution processed image of 961 low resolution shifted frames with 1/16-spotspacing step and maximum 15/16-spot-shift. It is a 31 × 31 shift matrix with arrangement similar to Fig. 6(d). From the enlarged resolution image, we can distinguish G5E6 of 8.77 µm line width which is verified in (M) right. Moreover, in order to verify the effectiveness of the superresolution algorithm, we upsampled the same 961 input low resolution images by bicubic interpolation to the same image size as (H), and then averaged them with shift compensation. The final output image (K) has the same pixel resolution as (H) but much worse spatial resolution, barely observing 12.40 µm line width pattern (G5E3). This demonstrated that the resolution improvement is from both sub-spot-spacing shifts and the superresolution algorithm instead of barely from increasing the sampling rate.
Further increase in lateral resolution can be performed by Lucy-Richardson deconvolution processing of Fig. 10(H) with an optimized Gaussian PSF or by blind deconvolution showing improved images in (I) and (J) with visibility of G6E1 of 7.81 µm line width without additional hardware configuration and scanning changes. The superresolution with deconvolution processing significantly enhances the contrast of the resolution pattern. All three lines now in G5E6 in (I) and (J) are much clearer than in (H). In another word, the deconvolution processing is very helpful to the lateral modulation transfer function (MTF) of the OCT system. The selection of the optimized Gaussian PSF was performed by iteration selection of Gaussian parameters to achieve the best output image. Compared with Lucy-Richardson deconvolution with a regular Gaussian PSF, the blind deconvolution can estimate an irregular PSF and deblurred the (H) with less ringing artifacts, although the background appears a little noisy. Thus, for the following deconvolution images, we mainly use the blind deconvolution method. We also showed the Gaussian PSF or estimated PSF at the right bottom of the deconvoluted image.
Compared with original C-scan (A), our superresolution technique improved the lateral resolution from 25 µm to 8.77 µm (without deconvolution) and to 7.81 µm (with deconvolution), a factor of ~3 improvement. From the above comparison, we can summarize that for lateral resolution improvement the superresolution of low density C-scans is better than multi-frame averaging of high density C-scans which is better than simple high density C-scan. The superresolution with deconvolution processing will further improve the lateral resolution.
According to Rayleigh criterion [49], the resolution limit of an optical system is about half of the focused spot size. Our present measured beam spot radius is ~10.5 µm similar to the 9.84 µm line width of G5E5 in Fig. 10(G). This agrees well with the theory of diffraction limit and Rayleigh criterion. We can actually observe the G5E6 line width of 8.77 µm in (H) slightly better than the spot radius with superresolution processing due to increase of pixel density, reduction of noise, and enhancement of image contrast.
The resolution of an optical system is physically restricted by the diffraction limited focusing spot which is related to the PSF of the optical system. High spatial resolution pattern which cannot be distinguished clearly is due to finite spot size blurring. The deconvolution processing with a proper PSF can break the diffraction limit for resolution and sharpness improvement. Superresolution processing with Lucy-Richardson deconvolution using an optimized Gaussian PSF in Fig. 10(I) or with blind deconvolution in Fig. 10(J) can see well the G5E6 line width of 8.77 µm with high image contrast and even see the next group element G6E1 with 7.81 µm line width, both breaking the diffraction limit and clearly showing the resolution improvement advantages.
2) Image quality. Simple high density scan did not improve the image quality. For all the six images in Fig. 10(A) to 10(F), the PSNRs were almost lower than 20 dB meaning that the image quality was not good. Increasing scan density did not help the image quality improvement. As mentioned earlier, with the same focusing condition and light source power, the six images should have similar image quality values such as PSNR and DR. However, we see larger PSNR and DR in (A) to (C) due to not enough pixel numbers in ROI which reduces the statistics reliability. Thus, we use scan matrix of 1024 × 1024 in (E) and 2048 × 2048 in (F) for image quality comparison with multi-frame averaged image in (G) and superresolution processed images in (H), (I) and (J). Through five-frame averaging, the PSNR is improved to 23.39 dB in (G) compared to 17.50 dB in (E). A 10-frame averaging can further reach 27.75 dB (not shown) but it is still lower than 30 dB while doubling the image scan time than fiveframe averaging. The superresolution processed image in (H) can achieve 31.50 dB PSNR, almost doubling the high density scan results in (E) and (F). Although all images in Fig. 10 have the same minimum and maximum 8-bit intensity values of 0 and 255, we recognize that the superresolution processed image show better brightness and contrast. The higher DR values in (H), (I), and (J) make brighter appearance to human eye observation. The DR value of (H) also almost doubles that values of (E) and (F). The superresolution with deconvolution processed images in (I) and (J) decrease a little in the image quality PSNR and DR as compared to (H). This is due to the increased sharpness of background noise caused by the deconvolution processing. Here, we simply summarize from the comparison of image quality that the superresolution processing is better than multi-frame averaging which is better than high density scan. The superresolution with deconvolution improves image resolution but slightly reduce the image quality.
Moreover, we noticed that (K) has the best PSNR and DR value of all the images of Fig. 10, which is from averaging the upsampled input images (the same as (H)) with shift compensation. The STD value is only 1.63, exhibiting very smooth background without obvious noises. If only comparing the image quality values, we may have a misleading conclusion that upsampling and averaging method can provide better background noise suppression and speckle noise reduction than the superresolution technique. However, as we discussed above, the simple upsampling and averaging with shift compensation method is harmful to the lateral resolution improvement, which can lead to the maximum resolution of 12.4 µm, poorer than simple high density scan in (E). The upsampling method, such as bicubic interpolation, will penalize the high frequency signals in the original low resolution image, reducing the edge sharpness and smoothing the background. Thus it is not an acceptable method for most OCT applications.
3) Scan time. From column 2 of Fig. 10(F)~10(J), we can simply summarize that the scan time of the superresolution technique is faster than both the high density scan and the multi-frame averaging. Superresolution processing can provide much better resolution and image quality with less scan time. These comparisons of different scan imaging methods are for achieving resolution limit. In practice, we need to consider acceptable scan time. For example, the present scan time of Fig. 10(A) for FOV of ~1 × 1 mm 2 takes 0.18 s while that of (F), (G), and (H) take 3.41, 4.27, and 3.2 minutes, respectively. Increasing the FOV to ~3 × 3 mm 2 area at the same scan density of (F), (G), and (H) would take 30.7, 38.4 and 28.8 minutes (excluding fast Fourier transform calculation) which may be too long for many practical applications. To reduce scan time for practical applications, we compare a list of superresolution processed images in Fig. 11 with much fewer C-scans than Fig. 10(H). We first show that the gray shifts as in Fig. 6(c) and 6(d) are needed for higher lateral resolution and image quality. Although the red dot shifted patterns are enough for lateral pixel resolution improvement by superresolution processing, those additional gray dot shifts contribute to image noise reduction, the lateral spatial resolution and overall image quality improvement. In Fig. 11(A), we cannot see the pattern G5E3 with Fig. 6(b) scan strategy. While scanning with Fig. 6(c) strategy, the pattern of G5E3 in Fig. 11(B) is clearly visible and we can further partly distinguish the G5E4 pattern. Similarly, in Fig. 11(D), the structure pattern of G5E5 is not visible with red scan pattern of Fig. 6(d). Including the gray scan patterns of Fig. 6(d), the G5E5 pattern in Fig. 11(E) becomes visible. Thus, these gray shifts can effectively improve the lateral resolution to the next level in the resolution test target as well as reduce the image noise by about 20~70% in STD and RMS.
The superresolution processing with deconvolution has demonstrated its contribution to the resolution improvement as shown in Fig. 11(C) and 11(F) as compared to (B) and (E), respectively. In Fig. 11(C), we applied Lucy-Richardson deconvolution instead of blind deconvolution due to less artifacts. The deconvolution again does show some degradation to image quality with increasing background noise as in Fig. 10(I) and 10(J). It is important to note that the superresolution with deconvolution does not spend any extra scan time.
Comparing with Fig. 10(B) to 10(E), the results in Fig. 11 clearly show the advantage of multi-frame superresolution processing with acceptable scan time while offering resolution and image quality improvement. Reducing from 961 to 49 shifted C-scan image acquisition, it now takes only about 9.8 seconds to see the 9.84 µm line width pattern in Fig. 11(F), while the 1024 × 1024 high density scan in Fig. 10(E) takes about 51 seconds to see the 11.05 µm line width pattern with lower image quality. Similarly, Fig. 11(A) to 11(E) provides better lateral resolution with similar or shorter scan time than Fig. 10(B) to 10(E). Clearly, the superresolution technique has demonstrated superior performance in lateral resolution and image quality improvement with reasonable short scan time.
Based on lateral resolution improvement experiments, the resolution and image quality vs. scan time are summarized in Table 3. Clearly, the multi-frame superresolution technique can achieve much better lateral resolution and image quality with less scan time than high density scanning and multi-frame averaging. Fig. 11. A list of superresolution processed images with fewer C-scans. Superresolution processed images (A) and (B) with scan strategies as in Fig. 6(b) and 6(c), respectively. Superresolution processed images (D) and (E) with scan strategies as in Fig. 6(d) without and with gray shifts, respectively. Superresolution processing with deconvolution using an optimized Gaussian PSF for image (B) is shown in (C) and the Gaussian PSF is shown at the right bottom of (C). Superresolution with blind deconvolution for image (E) is shown in (F) and the estimated PSF is shown at the right bottom of (F). *The values are for reference only due to low pixel numbers in ROIs. SR in (A) to (E) is short for superresolution. SR w LR de in (C) is short for superresolution with Lucy-Richardson deconvolution. SR w blind de in (F) is short for superresolution with blind deconvolution. The NA of the measurement arm lens is 0.015 with focused beam spot size of about 21 µm. The lens focal length is 100 mm. a The scan time here take the original low density 64 × 64 C-scan without spot overlapping as unit 1 for reference. b 64 2 means 64 × 64, 128 2 means 128x128, and so on. c The higher, the better. d 'de' means superresolution processing with deconvolution shown in Fig. 10 and 11. *The values are only for reference.
In addition to the experiment using 100 mm focal length lens above, we also performed experiment using a 30 mm focal length lens with a focused beam spot size of ~6 µm to image the resolution target patterns group 6~7 of Fig. 9(b). Figure 12 exhibits the original, the high density scanned, the multiple high density scan averaged, and superresolution with deconvolution processed images. The original low density scan (64 × 64) cannot distinguish any pattern, except the G6E1 with line width of 7.81 µm as in Fig. 12(a). With 1024 times higher scan matrix density (2048 × 2048) or five-frame averaging of 1024 × 1024 scanned image (taking 1280 scan time unit), the 3.10 µm (G7E3) and the 2.76 µm (G7E4) become barely visible as shown in Fig. 12(b) and 12(c). After multi-frame superresolution processing with blind deconvolution of 961 shifted image frames with 1/16-spot-spacing shift step and maximum 15/16-spot-shift, we can see the 2.19 µm patterns (G7E6) as in Fig. 12(d). The lateral resolution has been significantly improved from 7.81 µm (the original one in Fig.  12(a)) to 2.47 µm (superresolution without deconvolution, not shown) and to 2.19 µm (superresolution with blind deconvolution in Fig. 12(d)), about 3 and 3.5 times improvement.
Compared with high density scan and multi-frame averaging, our superresolution technique again exhibits obvious advantage in lateral resolution improvement. The superresolution technique further shows the apparently better image quality than other methods: PSNR and DR of 103.7% and 137.9% (without deconvolution) and 65.2% and 106.3% (with deconvolution, Fig. 12(d)) higher than high density scan ( Fig. 12(b)); PSNR and DR of 50.9% and 60.5% (without deconvolution) and 22.4% and 39.2% (with deconvolution) higher than multi-frame averaging (Fig. 12(c)). Similar to the results of using 100 mm focal length lens system, the use of 30 mm focal length lens demonstrates again that the superresolution technique can offer higher lateral resolution and better image quality with less scanning time than high density C-scan and multi-frame averaged images. The present Lucy-Richardson deconvolution with a Gaussian PSF or blind deconvolution however have some problems: 1) Although the deconvolution effectively improves the lateral resolution, it introduces some artifacts around resolution pattern and in the background area, as shown in Fig.  10(I) 10(J), Fig. 11(C) 11(F) and Fig. 12(d), which may not be acceptable for some critical applications. The artifacts are from both imperfect PSF selection and the discrete Fourier transform.
2) Although the blind deconvolution performs better than Lucy-Richardson deconvolution, it still cannot avoid the increase of background noise. To our observation, the deconvolution methods are sensitive to the noise level. If background noise is as low as Fig. 10(K), the deconvolution processing will not introduce a lot of artifacts shown in Fig. 10(L). However, the method in Fig. 10(K) is harmful to the spatial resolution improvement. Practically, for most biomedical applications, it is difficult to obtain a penetrated image with so smooth background as well as maintaining high resolution due to various scattering mediums.
3) After beam penetrating into a sample, the scattering would alter the beam intensity distribution. The optimized PSF thus may be different in different tissue samples and at different sample depth layers [31]. Even with advanced blind deconvolution algorithm, it is still difficult to estimate the ground true PSF [57]. The estimated optimal PSFs in Fig. 10(J), 10(L), Fig. 11(F) and Fig. 12(d) are different. Because of the above issues, we did not apply deconvolution processing to biomedical OCT images in following tissue imaging experiments. It is however important to point out that the superresolution processing with deconvolution can solve the diffraction limit, sampling rate and background noise problems together to significantly improve the lateral resolution and image quality.

Improved lateral resolution imaging of in vitro skin sample
Thus far, we have successfully demonstrated the lateral resolution and image quality improvement by the multi-frame superresolution processing of low resolution C-scan images with sub-spot-spacing shifts. The multi-frame superresolution processing of C-scan SD-OCT images can offer better image quality with less scan time than high density C-scan images, and is especially suitable for quick in vitro scan imaging of fresh biomedical samples because their physical or chemical properties may change with time (such as drying and oxidation) after exposing in the air. We examined in vitro 3D tissue imaging of fresh skin samples. Figure 13(a) and 13(b) show the multi-frame superresolution processed high resolution 3D in vitro image of a chicken wing skin at different angles, using 25 shifted C-scans (similar to Fig. 6(c), with 1/2spot-spacing shift step and maximum one-spot-shift) acquired by the SD-OCT system. From the high resolution 3D images of Fig. 13(a) and 13(b), we clearly observe a chicken hair and pore on the wavy skin. We further compare a layer xy image about 600 µm under the skin surface as shown in Fig. 13(c), 13(d), 13(e) and 13(f). Following the yellow arrows, the processed image (e) (from nine 1/2-spot-spacing C-scans, using scan strategy of Fig. 6(c)) shows more visible details and structures that were not clear in the original low density scan image (c). The image (e) is also much better than high density scan image (d) in terms of resolution and low noise performance, which were acquired with 16 times higher scan density than (c) with almost double the scan time as (e). Here we did not provide averaging of five high-density images due to its 10× scan time than (e). A fresh chicken wing sample would dry to an observable level during the long scan time, making the data averaging of the high density scans impractical. The image (f) shows very detailed structures and low noise due to superresolution processing of 25 shifted image frames. Furthermore, we explored the advantage of the superresolution technique over the average of upsampled images (by bicubic interpolation) with shift compensation. As we discussed in Fig. 10, the upsampling and averageing method is helpful to the noise suppression, but has little help to the lateral resolution improvement. In order to exhibit the resolution improvement comparison, we scanned another fresh chicken wing sample with more microstructures about 450 µm under skin surface, as shown in Fig. 14. The visibility of these fine structures can serve as a good indictor to compare the lateral resolution. The processed lateral images Fig. 14(a) and 14(b) are from the same input low resolution images with fortynine shifted image frames of 1/4-spot-spacing shift step and maximum 3/4-spot-shift, using scan strategy of Fig. 6(d). Enlarging the yellow arrow areas in Fig. 14(b), we clearly observed fine structures under surface by superresolution processing. On the other hand, the average of the upsampled images with shift compensation cannot distinguish these microstructures, appearing like an out-of-focus blurred image, shown in Fig. 14(a). This result agrees well with the discussion in Fig. 10 above. This biomedical experiment demonstrates that our superresolution technique can effectively improve the lateral resolution and exhibiting obviously better image quality than the average of upsampled ones. The multi-frame superresolution technique offers significantly improved lateral image quality with less scan time than high density scanning, upsampling and averaging method should significantly benefit OCT imaging of in vitro fresh tissues and organs such as skins, kidney [58], trachea [59], and liver [60].

Improved lateral resolution imaging of in vivo samples
Our superresolution technique has demonstrated its advantages in lateral resolution and image quality improvement for in vitro imaging of resolution target and samples. For in vivo tissue imaging, there is concern of live tissue movement associated to body motion caused by breathing, blood flow pulsing, and unintended muscle movement. Long time high density scan easily introduces motion errors and artifacts to 3D images. Using our sparse matrix scan method, the SD-OCT image acquisition time of each 128 × 128 C-scan (excluding fast Fourier transform OCT image processing time) is only about 0.7 second, fast enough to ignore most motion image artifacts within each C-scan cycle. Ten such C-scans take only 7 seconds which is accepted to most imaging applications. For in vivo imaging, the amount and direction of unintended tissue movement (motion shifts) among different C-scans are unknown. We thus need to estimate the tissue movement among multiple C-scans and then perform superresolution processing for image resolution improvement. Figure 15 exhibits the multi-frame superresolution processing of seven in vivo low resolution C-scan human skin (a 30-year-old male volunteer) images to generate a final high resolution 3D image. Here, we did not introduce special lateral scan shifts to the seven low density C-scans. There are slight unknown shifts in these C-scans due to random minor unintended body motion and vibration. To apply the multi-frame superresolution processing to in vivo human skins, we first estimated the spatial shifts in all x -, y -, z -directions. The z -axis shift was found by comparing the intensity distributions of any two C-scans in the depth direction. After z -axis shift compensation for all C-scan images, we applied the multivolume registration algorithm to estimate the lateral shifts among these SVP images of different C-scans. We compensated the lateral shifts and then overlapped the compensated SVP image with the reference one to check the overlapping quality, as shown in Fig. 15(a)-(c). In Fig. 15(b), without shift estimator, the overlapping of two images in (a) resulted in significantly image blurring and non-overlapping of a hair structure on the top right. Through lateral shift compensation, the two images of Fig. 15(a) overlapped well in Fig. 15(c). After collecting the x -, y -, z -position shifts information which produces best overlapping after compensation, multi-frame superresolution processing was then performed layer-by-layer to improve lateral resolution for the C-scan 3D image. Figure 15(d) and 15(e) exhibit the lateral images in a deep layer (about 800 µm under the skin surface) without and with multi-frame superresolution processing, respectively. The processed image of Fig. 15(e) shows much improved tissue details and structures that cannot be easily identified in the original image of Fig. 15(d). Figure 15(f) and 15(g) are the final high resolution 3D image after the superresolution processing showing clearly wrinkles on the skin. Here we use two different color maps in Fig. 15(f) and 15(g): the hot color map in (f) shows skin surface situation and the jet color map in (g) exhibits well the intensity differences of all the image pixels in the 3D space. The superresolution processing also improves the B-scan image quality. We further performed OCT in vivo imaging of human fingerprint since the visibility of the eccrine sweat glands can serve as a good indictor to the OCT image resolution. Figure 16(a) shows an original low resolution fingerprint B-scan image covering about 5 mm scan width on a thumb (a 33-year-old male volunteer). We only observe external fingerprint but with very blurred images of the eccrine sweat glands and the internal fingerprint structures. The two right side images are enlarged areas corresponding to the areas of the yellow and blue arrows. The yellow rectangle image can see a blurred eccrine sweat gland but cannot distinguish the helical structure. The blue square image does not exhibit any information of eccrine sweat glands. After superresolution processing with ten C-scans, we extracted one B-scan image shown in Fig. 16(b) from the final high quality 3D image (Fig. 16(c)) at the same position of Fig. 16(a). The lateral image improvement processing is also helpful to the axial B-scan image. In Fig. 16(b), the helical structure of the eccrine sweat gland marked by the yellow arrow is clearly visible and enlarged at the right side. The three eccrine sweat glands have different intensity because their centers are not in the same B-scan plane. The superresolution processed B-scan exhibits excellent image quality with 49.9% PSNR and 50.6% DR improvement. The improvement from Fig. 16(a) to 16(b) can be clearly visualized. After segmenting the multi-layer fingerprint 3D image (Fig. 16(c)) into three layers: external fingerprint layer, eccrine sweat glands layer and internal fingerprint layer, the curved layer images are shown in Fig. 16(d), 16(e), and 16(f), respectively. The distribution of eccrine sweat glands in the whole scan area are beautifully displayed in Fig. 16(e). The 3D finger print features are shown in Fig. 16(d) and 16(f). We also applied the superresolution technique for in vivo retina imaging, a time sensitive imaging due to potential eye motion. Human eyes often have unintended motion owing to environmental attractions or disturbances and the motion amounts would likely be more than that of unintended skin tissue motions. To minimize OCT image errors and artifacts [37] caused by minor eye motion, we performed fast C-scan imaging. It appears that the 0.7 second scan time for 128 × 128 C-scan is acceptable without observable motion artifacts in most situations. After ten C-scans of a 33-year-old male volunteer, we projected multiple lateral layers in one C-scan to a flat plane to generate SVP images with enhanced distribution structure of blood vessels. This method has been widely applied in ophthalmic research for imaging of micro blood vessels' distribution [29,61]. Figure 17(a) shows one of the ten low resolution lateral SVP images covering ~3.5 × 3.5 mm 2 area. Main vessels are visible while smaller vessels are difficult to distinguish such as at yellow and blue arrows. Figure 17(b) is a bad quality C-scan with obvious artifacts especially in the left bottom where the main vessel structure pointed by the pink arrow is not seen. After superresolution processing from the acquired ten in vivo SVP images including the bad quality Fig. 17(b), the improved image in Fig. 17(c) reveals smaller vessels in branching structures which cannot be seen at yellow and blue arrows in Fig. 17(a) and 17(b). The main vessels are also less noisy. Overcoming the bad quality SVP image in generating fine quality output image demonstrated the effectiveness and robustness of our superresolution technique for suppressing outliers. By removing background, image with clear vessel structures is given in Fig. 17(d).
In the situation involving motion artifacts and non-pure translational shifts, future research may consider decomposition of images into a series of smaller patches each with only translational motion and minimal motion artifacts [62]. Blurring-warping model and superresolution technique presented herein should still be useful for image resolution and quality improvement. Compare with current layer by layer processing, the cloud computing service may speed up parallel processing of all the 3D images layers at the same time. Future research may also explore the correlated relations among neighboring depth layers for further axial resolution improvement.

Conclusion
In conclusion, we have successfully demonstrated the lateral resolution and image quality improvement of SD-OCT by multi-frame superresolution technique for both in vitro and in vivo tissue sample imaging applications. Through adjusting the matrix of control voltages to the galvanometer scanners, we can artificially introduce suitable sub-spot-spacing shifts to low density C-scans for in vitro sample imaging. After the superresolution processing of a set of these C-scan images, we have demonstrated about 3 times lateral resolution improvement, from 25 µm to 7.81 µm and from 7.81 µm to 2.19 µm with measurement arm lens NA of 0.015 and 0.05, respectively, using a known resolution target. Background noise reduction and image quality doubling without sacrificing the DOF of the focusing beam and lateral FOV has also been attained. We found that the superresolution technique offers higher lateral resolution and better image quality with less scan time than high density C-scan, multi-frame averaged image, and the upsampling and averaging method.
It is found that Lucy-Richardson deconvolution with an optimized Gaussian PSF and the advanced blind deconvolution may potentially break the diffraction limit to improve the lateral spatial resolution of an OCT system. Since the PSF is highly dependent on sample tissues and is different in different samples and at different depth layers, in practice it may not be easy to find the ground true irregular PSF even with blind deconvolution. Also, deconvolution is sensitive to noise signals. Nevertheless, we have compared superresolution processed OCT images with and without deconvolution processing to show its conceptual significance in lateral resolution improvement.
For in vivo tissue imaging, due to the concern of live body unintended minor motion and vibration, multi-volume registration algorithm is used to estimate translational shifts in xy plane without artificially introducing sub-spot spacing shifts, and the z -axis shifts can be estimated and compensated. The estimated shifts can then be used for superresolution processing for lateral resolution improvement. 3D images, layered 2D lateral images, and Bscan images of in vitro and in vivo skin tissues, finger prints, and retina layers have all shown remarkable resolution improvement and image quality improvement as compared to original single C-scan and high density C-scan images. By keeping the bad quality images with obvious artifacts in the input image sequence, our superresolution technique also demonstrated the robustness to suppress the outliers. Although the present study uses a SD-OCT system, the superresolution technique should benefit other scan based OCT imaging system including time domain OCT and swept source OCT and should benefit various OCT imaging applications with remarkable improvement in image resolution and quality.