Ultrahigh speed en face OCT capsule for endoscopic imaging

Depth resolved and en face OCT visualization in vivo may have important clinical applications in endoscopy. We demonstrate a high speed, two-dimensional (2D) distal scanning capsule with a micromotor for fast rotary scanning and a pneumatic actuator for precision longitudinal scanning. Longitudinal position measurement and image registration were performed by optical tracking of the pneumatic scanner. The 2D scanning device enables high resolution imaging over a small field of view and is suitable for OCT as well as other scanning microscopies. Large field of view imaging for screening or surveillance applications can also be achieved by proximally pulling back or advancing the capsule while scanning the distal high-speed micromotor. Circumferential en face OCT was demonstrated in living swine at 250 Hz frame rate and 1 MHz A-scan rate using a MEMS tunable VCSEL light source at 1300 nm. Cross-sectional and en face OCT views of the upper and lower gastrointestinal tract were generated with precision distal pneumatic longitudinal actuation as well as proximal manual longitudinal actuation. These devices could enable clinical studies either as an adjunct to endoscopy, attached to an endoscope, or as a swallowed


Introduction
Catheter based optical coherence tomography (OCT) enables high resolution, volumetric imaging of organ surfaces and luminal walls [1]. The rapid advancement and speed improvements of wavelength swept laser sources [2] enable in vivo three-dimensional (3D) OCT imaging at high frame rates. Increases in laser sweep rate have been accompanied by advances in optical scanning probe technologies. Early work with swept source OCT (SS-OCT) and a rotary scanning fiber optic probe with pullback demonstrated the feasibility of generating 3D gigavoxel image volumes [3,4]. Later SS-OCT studies used novel probe designs such as a balloon catheter [5][6][7], MEMS scanner [8], and resonant fiber scanning [9][10][11]. The distally scanning micromotor OCT probe was first demonstrated a decade ago [12,13] and more recently, multiple groups have achieved high frame rate 3D-OCT imaging over large fields using micromotor scanners [14][15][16]. Using ultrahigh speed MEMS tunable VCSEL laser technology and micromotor scanners, our group recently demonstrated endoscopic OCT angiography (OCTA) and en face OCT in the upper and lower gastrointestinal (GI) tract in human subjects [17,18].
However, rotary scanning to acquire volumetric data in a luminal organ has traditionally been performed using a proximal motorized pullback for the longitudinal, slow scan direction. Proximal actuation transmits force via a long tether, such as a torque cable, which is subject to mechanical deformation such as stretching and twisting. This limits the actuation stability at the distal end of the probe and degrades scan accuracy along the longitudinal axis. Probes with distal resonant fiber scanning have been widely investigated [9,11,19,20], but the scan speed is not adjustable, image sampling is non-uniform because of the need for resonant scan patterns such as the spiral or Lissajous, and imaging is primarily in the forward direction. Two-dimensional (2D) non-resonant fiber scanners have been developed [21-24], but typically require high voltage actuation for the non-resonant axis/axes, which could limit clinical utility. A novel side-viewing probe based on the piezoelectric 'squiggle' motor achieved 2D distal side scanning using a rotating leadscrew [25], but the scan axes could not be controlled independently.
Wireless capsule endoscopy is a well-known and clinically accepted modality for comprehensive imaging of the entire gastrointestinal tract [26]. However, the wireless capsule travels rapidly down and away from the esophagus, and it is not possible to slow down or steer the device for high resolution, closer inspection of the esophageal wall. It was found that simply tying a string to a capsule endoscope was a useful and well tolerated method for diagnosing Barrett's Esophagus (BE) with good diagnostic sensitivity [27][28][29]. Seibel et al. developed a tethered capsule endoscope with forward viewing capability based on resonant fiber scanning which could perform white light imaging of the esophageal wall at video rates [30,31]. The tethered capsule had the advantage that it could be swallowed by unsedated patients for screening applications, without the need for conventional endoscopy which requires sedation. More recently, the tethered capsule concept was demonstrated using SS-OCT for circumferential 3D OCT imaging [32], as well as spectrally encoded confocal microscopy [33]. These devices had a highly flexible and soft tether that was shown to be easily swallowed and well tolerated by patients [34]. In this design, the distal optics were scanned in the rotary direction via a proximally actuated torque cable and the capsule was allowed to passively traverse the esophagus by peristalsis, then manually retracted (pullback) by the soft tether in order to scan the longitudinal extent of the esophagus. Manual pullback in order to perform the longitudinal scan is rapid and convenient in a screening context, but may not be stable or precise enough to achieve high resolution volumetric OCT for en face OCT, especially at limited frame rates supported by proximal torque cable actuation. In addition, precision longitudinal beam scanning is required for scanning microscopies such as confocal, fluorescence or multiphoton microscopy imaging. 
Therefore, there is a need for device technologies which can perform high speed rotary beam scanning combined with precision longitudinal scanning, maximizing image quality and reducing intraluminal device time.
En face features in the GI tract, such as mucosal surface patterns or 'pit patterns', are known markers for disease [35,36]. The ability to resolve these patterns with en face OCT could improve diagnostic sensitivity. However, image quality depends on optical resolution and scan pattern repeatability. Accurate and stable beam scanning enables imaging of structural features in the en face plane, as well as more advanced functional methods such as OCTA [37]. Previous probe designs for circumferential luminal scanning such as balloon and capsule catheters were limited by scan instability due to proximal rotary and longitudinal actuation, and relatively poor transverse resolution because the optics have to be centered in the large diameter probe. Small diameter probes can have better transverse resolution, but cover only a limited portion of the luminal circumference, such that multiple pullbacks over different sectors of the esophagus are required to survey the entire circumference.
This paper describes a capsule probe that incorporates 2D circumferential and longitudinal beam scanning in the distal end, thus eliminating mechanical instability from proximal torque cable actuation. The device has a 3.8 mm diameter tether that is semi-rigid to allow manual positioning, pulling back or advancing the device, depending on the mode of operation, such that peristaltic propulsion is not necessarily required. The rotary scanner is a micromotor, and the longitudinal scanner is a novel pneumatic actuator. The optical design enables focusing elements to be positioned close to the tissue surface, to improve transverse resolution. Additionally, the path length of the OCT sample arm is designed to vary with the longitudinal pneumatic scan, enabling non-uniformities in the longitudinal scan trajectory to be tracked and corrected in post-processing. The longitudinal scan can also be performed by proximal manual actuation of the entire capsule (large field coverage) via the tether, versus by distal pneumatic actuation of an internal carriage with the capsule kept stationary (small field, high stability inspection). Compared to our previously reported micromotor probe [15], the capsule probe has significantly improved field of view and 2D distal scanning capabilities. Images were acquired in the living swine esophagus, rectum, and anal canal. This study serves as an important translational step in validating the ultrahigh speed capsule OCT technology for potential clinical applications in screening and surveillance. In particular, the ability to image a wide range of field sizes, in both the upper and lower GI tract, and in sedated subjects promises to enable new applications of OCT as a next-generation GI imaging modality.

Imaging system
The OCT system used a dual circulator Michelson interferometer as shown in Fig. 1. The data acquisition card (AlazarTech, Quebec, Canada) had 12 bit resolution and was rated at up to 1.8 GS/s with internal clocking, but was limited to lower sampling rates when externally clocked. The acquisition card was optically clocked by a Mach-Zehnder interferometer with a variable fringe frequency of up to 1.1 GHz, which was the maximum frequency at which the A/D card could be variably clocked. The laser source was a 1300 nm MEMS-based vertical-cavity surface-emitting laser (VCSEL) [38], and was driven sinusoidally at 500 kHz by an arbitrary waveform generator and high voltage amplifier. Both forward and backward wavelength sweeps of the laser were used to achieve an effective axial scan rate of 1 MHz. The VCSEL sweep bandwidth was 115 nm, the measured axial resolution was 12 μm in air (8.5 μm in tissue), and the Nyquist imaging range was 2.3 mm in air (1.6 mm in tissue). Sensitivity roll off was negligible across the imaging range because of the long coherence length of the VCSEL and high bandwidth of the detectors. The coherence length of the VCSEL has been previously reported to be more than 100 mm in air [39]. The power emitted from the imaging capsule was 35 mW. System sensitivity was measured using an isolated reflection from a flat-cleaved fiber patch cord with 0.6% reflectivity, with additional attenuation from a 17 dB single pass attenuator in the sample arm. The sensitivity was measured to be 105 dB, defined by the ratio of the OCT signal to the standard deviation of the magnitude of the OCT signal with the reflection blocked. The transmission through the capsule probe was 70 percent. The backcoupling efficiency of a mirror reflection into the probe was estimated to be 50 percent. The effective sensitivity of the OCT imaging engine with the capsule was ~102 dB.
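The relation between the drive frequency and the axial scan rate, and between the in-air and in-tissue figures quoted above, can be checked with a few lines of Python. This is only an illustrative sketch: the function names are hypothetical, and the group index of 1.4 is an assumption inferred from the ratio of the quoted in-air and in-tissue values, not a number stated in the text.

```python
def effective_a_scan_rate_hz(drive_frequency_hz, use_both_sweeps=True):
    """Each sinusoidal drive period contains one forward and one backward sweep."""
    return drive_frequency_hz * (2 if use_both_sweeps else 1)


def air_to_tissue(length_in_air, group_index=1.4):
    """Convert a length quoted in air to the corresponding length in tissue.

    group_index = 1.4 is an assumed value (hypothetical parameter).
    """
    return length_in_air / group_index


# Bidirectional sweeping of a 500 kHz drive gives the 1 MHz axial scan rate.
print(effective_a_scan_rate_hz(500e3))
# Axial resolution (micrometers) and imaging range (millimeters) in tissue;
# both land close to the rounded values quoted in the text.
print(air_to_tissue(12.0), air_to_tissue(2.3))
```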
Custom C++ software was used to control the beam scanning and data acquisition. The OCT interferometer reference mirror was on a motorized translation stage that was triggered by the acquisition software at a preset timing delay. The translation range of the pneumatic actuator in the capsule was about 3.5 mm, which is larger than the system imaging range. Thus, the reference mirror position was scanned at a speed matched to the maximum speed of the pneumatic actuator, in order to enable imaging over the full longitudinal translation range. The system was installed on a portable cart for convenient transport to the animal housing facility for imaging experiments.

Catheter design and assembly
A schematic of the capsule device is shown in Fig. 2(a), and photographs in Figs. 2(b) and 2(c). A collimated beam was reflected at a right angle by a prism and focused by a microlens to a focal plane outside of the glass tube enclosure. The microlens was mounted on a counterweighted lens holder on the shaft of a micromotor. The micromotor rotated the lens holder to circumferentially scan the focused beam. The micromotor was mounted on a carriage, which was centered in a glass tube by multiple contact points with non-marring set screws. The carriage was attached to a bellows that was expanded and contracted by pneumatic inflation and deflation via an inflation tube. When the bellows was actuated, the carriage slid smoothly along the glass tube, generating a precision longitudinal scan.
A commercially available, low cost miniature fiber coupled collimator (AC Photonics, CA) with 1.4 mm diameter and 6 mm rigid length was used to generate the input beam. The objective was a stock telecom-coated planoconvex lens with 2.0 mm diameter and 4.0 mm focal length. The objective was mounted in a lens holder that was rapid-prototyped by high resolution stereolithography (In'Tech Industries, MN). A stock 2.0 mm prism was mounted in a square hole in the center of the lens holder. The prism was angled at 6 degrees to minimize backreflections from the glass tube surface into the collimator. The lens was counterweighted by an identical lens to minimize vibration during rotation. The focal plane was located 1 mm from the glass surface assuming tissue contact, with a 26 μm full width at half maximum focused spot diameter.
The lens holder was designed and rapid-prototyped such that the prism and lens were aligned. The collimator (centered in the torque cable and mounted in the center of the proximal end cap) and lens holder (mounted on the micromotor shaft and centered in the glass tube) were also aligned coaxially. High precision, rapid-prototyped parts enabled the optical components to be self-aligning when assembled. In this design, longitudinal translation of the optical assembly varied the optical path length in the OCT interferometer sample arm, but did not change the focal plane of the optical system. This path length variation enabled the OCT instrument to precisely measure the longitudinal position of the carriage assembly. The end caps of the capsule were also rapid-prototyped. The proximal end cap had internal channels for the micromotor electrical cable and pneumatic inflation tube, and an extruded slot for management of the electrical cable over the imaging region. The distal end cap had an L-channel for passage of air or other fluid. A glass tube with outer diameter 12.0 mm and inner diameter 10.0 mm was cut to length with a diamond saw from stock fused quartz tubing (Technical Glass Products, OH). Glass was chosen for its high optical clarity and the close dimensional tolerances of commercial manufacturing. Future iterations designed for human use would use medical grade plastics. An elastic nitrile bellows was cut to length and attached on one end to the distal end cap such that the bellows could be pneumatically inflated via the L-channel. Kink-resistant Tygon tubing with outer diameter 0.8 mm (US Plastic Corp, OH) was attached to the opening of the L-channel, folded back outside the glass tube and inserted into a channel on the proximal end cap.
A brushless DC micromotor with 4 mm diameter (Namiki Precision, CA) was mounted in the center of a rapid-prototyped carriage. The carriage was designed with 2 sets of 3 through holes spaced radially apart. The holes were tapped through for 2-56 thread non-marring set screws. The set screws were screwed into the through holes such that the tips emerged from the opposing side, producing 2 parallel sets of 3 radially distributed contact points. The protrusions of each set screw were carefully optimized such that the carriage was well centered in the glass tube and were then fixed in position with adhesive. The carriage was then affixed to the open end of the bellows. To check for leaks, the carriage-bellows assembly was fully immersed in water and the bellows inflated. Leaks were sealed with adhesive.
After the bellows and carriage were assembled, the motor was mounted in the carriage, and the motor cable passed through one of the unused threaded holes on the carriage. The motor shaft was fitted with a precision jewel bearing (Swiss Jewel, PA), which allowed accurate mounting of the lens holder on the central axis of the motor. The collimator was centered in a torque cable of outer diameter 1.9 mm (ACTONE, Asahi-Intecc, Japan) that was then centered in the proximal end cap. The torque cable was used to increase the tether rigidity and was not used for any internal actuation. The carriage was qualitatively evaluated for low friction actuation, radial or angular fit, and alignment with the collimated beam. The laser spot was verified to be circular by a beam profiler (DataRay Inc., CA). Both end caps were snap fitted with adhesive on each end of the glass tube. Proximal to the capsule, the optical fiber, motor cable and inflation tubing were organized into a bundle that was passed through an FEP AWG8 tubing (Zeus Industrial Products, SC), which was then affixed to the proximal end cap of the capsule. At the proximal end of the entire tether, the optical fiber and motor cable were connectorized. The inflation tube was connected to a female Luer lock fitting and 20 mL syringe containing air, which was actuated by a motorized linear translation stage modified to work as a syringe pump.

Scan actuation and characterization
The rotary speed of the motor was set according to the spot size, capsule circumference and axial scan rate. For a 26 μm (FWHM) focused spot and π × 12 mm ≈38 mm circumference, ~3000 axial scans per revolution were required for Nyquist sampling. At 1 MHz A-scan rate, this corresponded to a maximum rotary speed of 330 Hz. The motor was driven at 250 Hz, sampling at 1.3 times Nyquist, with a three-phase D/A output (National Instruments) to an audio amplifier. In the longitudinal scan direction, the Nyquist translation speed was ½ × 26 μm × 250 Hz = 3.3 mm/s. The syringe pump for the pneumatic actuation generated a maximum longitudinal carriage translation speed of ~1 mm/s, sampling at 3 times Nyquist. The longitudinal speed could be increased with a larger syringe or a faster translation stage. The capsule was hermetically sealed, therefore there was a pressure build up and significant infusion pressure was required to expand the bellows within the capsule. This limitation can be addressed by future redesigns. Figures 2(b) and 2(c) show the pneumatic actuator in the start and end positions. The varying path length during longitudinal scanning of the carriage could be used to track the precise position of the carriage. The actuator had an acceleration phase, a near constant-speed phase, and a deceleration phase. The longitudinal actuator could be improved in the future by using hydraulic actuation to reduce the non-uniform speed artifacts.
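The sampling arithmetic in this paragraph can be reproduced with a short calculation. The Python sketch below is illustrative only: the function and key names are hypothetical, and Nyquist sampling is taken to mean one sample per half of the FWHM spot diameter, as implied by the ½ × 26 μm factor above. The unrounded results differ slightly from the rounded figures quoted in the text.

```python
import math


def nyquist_scan_parameters(spot_fwhm_um, tube_diameter_mm, a_scan_rate_hz,
                            frame_rate_hz, carriage_speed_mm_s):
    """Sampling figures for a fast rotary scan with a slow longitudinal scan."""
    step_um = spot_fwhm_um / 2.0                          # Nyquist sampling interval
    circumference_um = math.pi * tube_diameter_mm * 1e3   # scanned circumference
    a_scans_per_rev = circumference_um / step_um          # minimum A-scans per revolution
    max_rotary_hz = a_scan_rate_hz / a_scans_per_rev      # fastest rotation meeting Nyquist
    nyquist_speed_mm_s = step_um * 1e-3 * frame_rate_hz   # one half-spot step per frame
    return {
        "a_scans_per_rev": a_scans_per_rev,
        "max_rotary_hz": max_rotary_hz,
        # Acquired A-scans per revolution relative to the Nyquist minimum.
        "rotary_oversampling": (a_scan_rate_hz / frame_rate_hz) / a_scans_per_rev,
        "nyquist_speed_mm_s": nyquist_speed_mm_s,
        # How much slower the carriage moves than the Nyquist limit.
        "longitudinal_oversampling": nyquist_speed_mm_s / carriage_speed_mm_s,
    }


params = nyquist_scan_parameters(spot_fwhm_um=26.0, tube_diameter_mm=12.0,
                                 a_scan_rate_hz=1e6, frame_rate_hz=250.0,
                                 carriage_speed_mm_s=1.0)
for name, value in params.items():
    print(f"{name}: {value:.2f}")
```

With the values from the text, the calculation gives roughly 2900 A-scans per revolution and a Nyquist translation speed of 3.25 mm/s, consistent with the ~3000 and 3.3 mm/s quoted above.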
The longitudinal scan position and trajectory (Fig. 3(a)) were obtained by automatically detecting a reflection from the glass tube (Fig. 5) that was present in all frames, and measuring the frame to frame longitudinal translation. A ramp offset was then added to account for the constant speed translation of the reference mirror, to obtain the actual carriage displacement. The noise in the trajectory plot was generated by micromotor vibration that produced small displacements in the longitudinal direction as well as segmentation errors. The compressibility of the air in the bellows resulted in the vibrations being undamped. A linearized trajectory was obtained by smoothing the measured trajectory with a moving average filter and fitting a line to the central, nearly linear portion. Before longitudinal distortion correction, the frames were circular-shifted to the same interferometric delay by using the reflection from the glass tube. Frames were then interpolated by cubic spline to the linearized trajectory to correct for longitudinal distortion from the nonlinear longitudinal actuation. For each pneumatically scanned data set, the longitudinal trajectory from that data set was obtained and used to correct distortions. All data post-processing was performed in MATLAB (MathWorks, MA). Figure 3(b) shows the filtered and linearized trajectory used for the spline resampling. Figures 3(c) and 3(d) show an image of rectal crypts, before and after longitudinal position correction, respectively. Both images appear similar in the central region corresponding to the pneumatic near-constant speed phase, but the regions corresponding to the acceleration and deceleration phases are distortion-corrected.
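The smoothing, line fitting, and resampling steps described above might be sketched as follows. This is not the MATLAB code used in the study. It is a minimal pure-Python illustration with hypothetical function names and parameters (window length, central fraction), it assumes the input is the carriage displacement after the ramp offset has been added, it assumes the smoothed trajectory increases monotonically, and it substitutes linear interpolation between neighboring frames for the cubic spline used in the study.

```python
import bisect


def moving_average(values, window):
    """Centered moving average; the window shrinks near the ends."""
    half = window // 2
    out = []
    for i in range(len(values)):
        lo, hi = max(0, i - half), min(len(values), i + half + 1)
        out.append(sum(values[lo:hi]) / (hi - lo))
    return out


def fit_line(xs, ys):
    """Least-squares slope and intercept."""
    n = len(xs)
    mx, my = sum(xs) / n, sum(ys) / n
    sxx = sum((x - mx) ** 2 for x in xs)
    sxy = sum((x - mx) * (y - my) for x, y in zip(xs, ys))
    slope = sxy / sxx
    return slope, my - slope * mx


def linearized_trajectory(measured, window=25, central_fraction=0.5):
    """Smooth the measured positions and fit a line to the central portion."""
    smooth = moving_average(measured, window)
    n = len(smooth)
    start = int(n * (1 - central_fraction) / 2)
    stop = n - start
    slope, intercept = fit_line(list(range(start, stop)), smooth[start:stop])
    return smooth, [slope * i + intercept for i in range(n)]


def fractional_frame_index(smooth, target):
    """Invert an increasing trajectory: frame index at which `target` is reached."""
    if target <= smooth[0]:
        return 0.0
    if target >= smooth[-1]:
        return float(len(smooth) - 1)
    i = bisect.bisect_right(smooth, target) - 1
    return i + (target - smooth[i]) / (smooth[i + 1] - smooth[i])


def resample_frames(frames, smooth, linear):
    """Interpolate frames so that output frame j sits at position linear[j]."""
    out = []
    for target in linear:
        f = fractional_frame_index(smooth, target)
        i = min(int(f), len(frames) - 2)
        w = f - i
        out.append([(1 - w) * a + w * b for a, b in zip(frames[i], frames[i + 1])])
    return out
```

Here each frame is represented as a flat list of pixel values. Target positions outside the measured range are clamped to the first or last frame, which is a simplification chosen for this sketch.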

Animal imaging procedure and scan protocols
Imaging was performed under protocols approved by the Committee on Animal Care (CAC) at the Massachusetts Institute of Technology. Three Yorkshire swine of approximately 50 kg were imaged in two sessions with the same capsule device. Sedation in the swine was induced with intramuscular injection of 5 mg/kg telazol and 2 mg/kg xylazine, and atropine at 0.04 mg/kg was given to maintain heart rate after sedation and to control mucus secretions. For upper GI imaging, a 16.7 mm inner diameter overtube (US Endoscopy, OH) was slid over and up to the proximal end of an endoscope (Pentax Medical) before introduction into the animal. The endoscope was then introduced into the esophagus, the overtube slid over and into the esophagus from the distal end of the endoscope, and the endoscope withdrawn from the esophagus, leaving the overtube in place. The capsule was introduced directly into the overtube by advancing the semi-rigid tether. The overtube was then withdrawn from the esophagus and mouth, and slid up to the proximal end of the tether. Thereafter, the distal end of the probe could be longitudinally positioned by advancing or retracting the semi-rigid tether. For lower GI imaging, a tap water enema was administered to clear the rectum of stool. The capsule was then inserted via the anal canal and then advanced forward using the tether into the rectum. X-ray images (Hudson Digital Systems, NJ) were acquired to confirm positioning of the capsule in the upper and lower GI tract (Fig. 4). For 2D distal scanning, the capsule was positioned at an area of interest based on real-time cross-sectional OCT display, and left stationary while the pneumatic actuator performed the longitudinal scan. The data size of each pneumatically scanned acquisition was 960 samples × 4000 A-scans × 1750 frames for a 7 second acquisition over a longitudinal range of ~3.5 mm, corresponding to a total OCT scanned area of ~1.3 cm².
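The frame count and scanned area of an acquisition follow directly from the frame rate, duration, capsule circumference, and longitudinal range. The following Python sketch is illustrative: the function name is hypothetical, and the raw data volume assumes that each 12 bit sample is stored in a 2 byte word, which is not stated in the text.

```python
import math


def acquisition_summary(frame_rate_hz, duration_s, a_scans_per_frame,
                        samples_per_a_scan, scan_length_mm,
                        tube_diameter_mm=12.0, bytes_per_sample=2):
    """Return (frames, scanned area in cm^2, raw data size in bytes)."""
    frames = round(frame_rate_hz * duration_s)
    circumference_mm = math.pi * tube_diameter_mm
    area_cm2 = circumference_mm * scan_length_mm / 100.0   # mm^2 to cm^2
    # bytes_per_sample=2 is an assumption (12 bit samples in 16 bit words).
    raw_bytes = samples_per_a_scan * a_scans_per_frame * frames * bytes_per_sample
    return frames, area_cm2, raw_bytes


# Pneumatically scanned acquisition: 7 s at 250 frames per second over ~3.5 mm.
frames, area_cm2, raw_bytes = acquisition_summary(250, 7, 4000, 960, 3.5)
print(frames, round(area_cm2, 2), raw_bytes / 1e9)
```

The same function applies to a manually actuated pullback by changing the duration and longitudinal range.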
For manual actuation to obtain a large field of view, the pneumatic actuator was disabled while continuing distal rotary scanning. The entire capsule was pulled back or advanced manually at a speed of ~3 mm/s via the semi-rigid tether. The data size of each manually actuated acquisition was 960 samples × 4000 A-scans × 7000 frames for a 28 second acquisition over a longitudinal range of ~80 mm, corresponding to a total OCT scanned area of ~30 cm². Figure 5 shows representative OCT cross-sections of the esophagus and rectum in polar and Cartesian coordinates. Cross-sectional OCT images (B-scans) were displayed in logarithmic grayscale. Each B-scan consisted of 4000 A-scans and was acquired at 250 frames per second. The squamous epithelium (e), lamina propria (lp), muscularis mucosa (mm), submucosa (s), and muscularis propria (mp) in the esophagus were visible, and OCT imaging depth was over 1 mm (Figs. 5(a) and 5(c)). In the rectum, the OCT cross-sectional images showed the characteristic columnar structure of crypts with vertical shadowing and increased OCT signal attenuation (Figs. 5(b) and 5(d)). The outer glass surface of the capsule was approximately index matched with the tissue and generated minimal reflections. The inner glass surface (white arrow) appeared as an aliased reflection, which was detected to obtain the longitudinal trajectory for pneumatic scans, as described in Section 2.3. The features which appear as discontinuities at the tissue surface were produced by the micromotor cable (aliased, blue arrow) and inflation tube (orange arrow).

Motion artifacts in the en face plane
In order to qualitatively assess the effects of cardiac motion and other possible motion artifacts, the capsule was positioned mid-esophagus and left stationary while images were acquired for 28 seconds with distal rotary scanning (no manual or pneumatic longitudinal actuation). Figure 6 shows a representative en face OCT image (4000 × 7000 pixels) from about 600 μm below the esophageal surface, projected over a 100 μm depth range. The motor cable (blue arrow) and inflation tube (orange arrow) appear as horizontal shadow artifacts. Notably, the motor cable artifact appears as a straight line. The motor cable was located inside the capsule, therefore it was not subject to tissue motion or perturbations relative to the micromotor. Any instability in the distal rotary scan would manifest as oscillations in the shadow artifact, but these were not visible at this scale. Therefore the image oscillations were not due to the distal rotary scan.
Along the time axis, there were about 40 oscillatory periods at higher frequency, and about 12 oscillations at lower frequency. The total acquisition time was 28 seconds. Therefore, the higher frequency was about 1.4 Hz, and the lower frequency was about 0.4 Hz. These frequencies correspond approximately to heart rate and respiratory rate respectively, as described in veterinary literature [40]. Both heartbeat and breathing produced significant motion in the esophagus due to its close proximity to the heart and lungs. The capsule appeared to be in continuous, periodic motion in the transverse plane. Motion in the longitudinal direction was harder to detect using the transverse scanning acquisition protocol.
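The frequency estimates follow from counting oscillation periods over the known acquisition time. A minimal Python sketch of this arithmetic is given below; the function names are hypothetical and the cycle counts are the approximate values read from the image.

```python
def oscillation_frequency_hz(cycles, duration_s):
    """Mean frequency of a periodic artifact counted over an acquisition."""
    return cycles / duration_s


def frames_per_cycle(cycles, total_frames):
    """Number of B-scans spanned by one oscillation period."""
    return total_frames / cycles


fast = oscillation_frequency_hz(40, 28.0)   # higher frequency component
slow = oscillation_frequency_hz(12, 28.0)   # lower frequency component
print(f"fast: {fast:.2f} Hz ({fast * 60:.0f} per minute)")
print(f"slow: {slow:.2f} Hz ({slow * 60:.0f} per minute)")
print(f"frames per fast cycle: {frames_per_cycle(40, 7000):.0f}")
```

At 250 frames per second, each cycle of the faster oscillation spans on the order of 175 frames, so the motion is densely sampled along the time axis.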

En face OCT by manual scanning
En face images were displayed in square root grayscale. Ultrahigh speed OCT enabled the visualization of en face structural features, even using a relatively rapid and uncontrolled manual longitudinal actuation via the tether, and with significant motion of the esophagus. Each volumetric acquisition took 28 seconds and consisted of 7,000 frames. About 5% of the start of each image was cropped due to a brief delay between the start of data acquisition and the start of manual longitudinal scanning. Assuming a longitudinal actuation speed of ~3 mm/s, the pullback length was ~8 cm and the OCT field size was ~30 cm².
Volumetric data was acquired by manually advancing the capsule from the proximal to distal esophagus. Figure 7 shows en face OCT views of the esophageal wall at different depths and projections. The full projection (mean over 600 μm depth) image (Fig. 7(a)) produced a relatively low contrast image with some shadowing reminiscent of a vascular network (indicated by arrows), similar to an endoscopic image. Figure 7(b) shows an en face OCT image at ~200 μm below the surface and projected over a 100 μm depth range, where the lower and higher scattering areas correspond to the epithelium and lamina propria layers, respectively. The varying contact and resultant compression of the esophageal wall on the capsule led to tissue layers being tilted relative to the capsule surface and thus spanning a range of depths. Figure 7(c) shows an en face OCT image at ~600 μm below the surface and projected over a 100 μm depth range, which shows high contrast features resembling a vascular network of large and small vessels.
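An en face view of this kind is a mean projection of the volume over a depth slab, followed by grayscale mapping. The Python sketch below illustrates the idea under stated assumptions: the volume is indexed as `volume[frame][a_scan][depth]` with linear OCT magnitudes, depth indices are counted from the tissue surface, and the axial pixel size passed to `depth_to_pixels` is a hypothetical value that is not given in the text. All function names are illustrative.

```python
import math


def depth_to_pixels(depth_um, axial_pixel_um):
    """Convert a depth in micrometers to a pixel index (pixel size is assumed)."""
    return round(depth_um / axial_pixel_um)


def en_face_projection(volume, start_px, thickness_px):
    """Mean of each A-scan over the slab [start_px, start_px + thickness_px)."""
    image = []
    for frame in volume:                       # slow (longitudinal) axis
        row = []
        for a_scan in frame:                   # fast (rotary) axis
            slab = a_scan[start_px:start_px + thickness_px]
            row.append(sum(slab) / len(slab))
        image.append(row)
    return image


def sqrt_grayscale(image):
    """Map intensities to [0, 1] with square root scaling for display."""
    peak = max(max(row) for row in image)
    return [[math.sqrt(value / peak) for value in row] for row in image]


# Tiny synthetic volume: 1 frame, 2 A-scans, 4 depth pixels.
volume = [[[1.0, 2.0, 3.0, 4.0], [4.0, 4.0, 4.0, 4.0]]]
projection = en_face_projection(volume, start_px=1, thickness_px=2)
print(projection)
print(sqrt_grayscale(projection))
```

Square root scaling compresses the dynamic range less aggressively than the logarithmic scaling used for the cross-sectional images.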
En face OCT images were also obtained from the squamo-columnar junction ('dentate line') by manually advancing the capsule from the anal verge to the rectum. Figure 8 shows an en face OCT image at ~300 μm below the tissue surface and projected over a 50 μm depth range. The anal canal exhibited squamous epithelium and tissue folds, which transitioned to crypt structures in the rectum. Crypts were generally round and densely packed. Distortions from the regular crypt pattern in the longitudinal direction were due to varying speed of the manual insertion of the capsule.

En face imaging by 2D distal scanning
Volumetric OCT data was also acquired with the pneumatic actuator for longitudinal scanning in order to demonstrate precision 2D scanning over a smaller field of view. Each 3D OCT data set consisted of 1,750 frames over a ~3.5 mm longitudinal distance, acquired in 7 seconds. The scanned area was ~1.3 cm². Since the squamous epithelium of the normal upper GI tract is relatively featureless in en face OCT, imaging was performed in the lower GI tract. Figure 9 shows an en face OCT image at the dentate line in the rectum, at ~400 μm from the surface and projected over a ~50 μm depth range. The motor cable was visible as an aliased artifact. The inflation tube was in a region of tissue non-contact, such that it sagged away from the glass surface (Fig. 2) and was not visible at this particular en face depth. The longitudinal position of the carriage was measured using OCT, and non-uniform scanning in the longitudinal direction was corrected as described in Section 2.3. Oscillation artifacts in the rotary direction are visible when the en face image is zoomed. These oscillations are highly regular, suggesting that they are due to non-uniform rotational distortion (NURD) of the micromotor. The deviation in rotary position was about 50 μm as measured from the image, which over a circumference of 38 mm corresponded to an angular deviation of about 8 mrad. The NURD can be corrected with methods such as a fiducial-based scan correction technique reported by our group, which can reduce NURD by over an order of magnitude [37].
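The conversion from the measured rotary deviation to an angular deviation is the arc length divided by the circumference, multiplied by 2π. A short Python check is given below; the function names are hypothetical, and the default of 4000 A-scans per revolution is taken from the acquisition protocol.

```python
import math


def angular_deviation_mrad(arc_deviation_um, circumference_mm):
    """Angle subtended by an arc-length deviation on the capsule surface."""
    fraction_of_turn = (arc_deviation_um * 1e-3) / circumference_mm
    return fraction_of_turn * 2 * math.pi * 1e3     # radians to milliradians


def deviation_in_a_scans(arc_deviation_um, circumference_mm, a_scans_per_rev=4000):
    """The same deviation expressed as a number of A-scan spacings."""
    spacing_um = circumference_mm * 1e3 / a_scans_per_rev
    return arc_deviation_um / spacing_um


print(f"{angular_deviation_mrad(50.0, 38.0):.1f} mrad")
print(f"{deviation_in_a_scans(50.0, 38.0):.1f} A-scans")
```

A 50 μm deviation therefore corresponds to a shift of roughly five A-scans at the sampling density used here.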

Discussion
The ability to accurately scan and image a larger circumferential region of luminal structures than previous small diameter probes is important for many endoscopic imaging applications. This study demonstrated a novel imaging capsule in living swine as a translational step towards human imaging and clinical studies. The capsule combines large field coverage (~30 cm²) with proximal manual longitudinal actuation, and small field inspection (~1 cm²) with precision distal pneumatic longitudinal actuation, for volumetric and en face imaging at microscopic resolution. These field sizes are orders of magnitude larger than those of existing endomicroscopy technologies such as magnification narrow band imaging with 1-2 mm² fields [41] and confocal laser endomicroscopy with <0.3 mm² fields [42]. In addition to large field of view imaging, the option to inspect a small field for fine structural anomalies could be important for real-time detection of dysplasias, which are known to be focal. The 2D distal scanning can generate undistorted en face features that, in future clinical studies, may resolve pit patterns which are markers of disease. The ultrahigh speed swept laser source was critical in order to image en face features over a large field. The device incorporates several technical innovations, many of which were made possible by the design flexibility of rapid prototyping. The availability and low cost of stereolithography suggest that even precision components with relatively small dimensions can be economically produced in prototype quantities or larger. Moreover, the resolution and dimensional accuracy were sufficient for parts to be used as optical mounts, making self-alignment of optical beams and bulk optics simple and accurate. The design can also be scaled to different focused spot sizes and transverse resolutions.
Having the focusing optics perpendicular to the tissue surface, rather than reflecting 90 degrees as in previous generation proximally scanned catheters, allows higher numerical aperture focusing. The ability of the capsule to scale to high numerical aperture, combined with precision distal longitudinal scanning should enable these devices to be used for optical coherence microscopy (OCM) [43,44] or other scanning microscopy techniques. The primary challenge in designing the precision distal longitudinal scanning optical system was the requirement that the translating carriage had to remain aligned to the collimated beam over the full longitudinal travel. Careful adjustment of the carriage contacts to ensure optimized centration and alignment in the glass tube was important. The uniformity and optical clarity of glass tubing was advantageous; however, future clinical devices would use medical grade plastics.
The requirement of precise but slow scanning in the longitudinal direction is extremely challenging, because it is difficult to achieve precision at slow speeds. The pneumatic longitudinal actuator is a novel method for performing precise distal longitudinal scanning. It is cylindrically symmetric, simple, extremely compact when deflated, and capable of actuating at different speeds dependent on the proximal inflation. Using a bellows with optimized material and structural design, together with a shorter motor would further improve the longitudinal scan length. Expanding the bellows within the sealed capsule required a relatively high pressure. Also, the compressibility of air resulted in inefficient and nonlinear actuation. Use of hydraulic actuation with an incompressible fluid would enable more accurate and rapid longitudinal translation, even with a hermetically sealed capsule. An incompressible fluid should also reduce parasitic oscillatory motion of the carriage due to motor vibrations. Finally, it should also be noted that the inflation tube was located outside the capsule due to space constraints, but a custom designed plastic enclosure could have the inflation tube fabricated inside the capsule or within the capsule wall.
The ability to distally actuate longitudinally while simultaneously measuring the longitudinal position for post-acquisition volumetric data correction is important not only for applications such as en face OCT and OCTA, but also promises to enable high precision scanning microscopy modalities such as confocal fluorescence or multiphoton microscopy. Our group previously reported a fiducial based scan correction method for the fast rotary scan [37]. The present study demonstrated feasibility of correcting the slow longitudinal scan by measuring its trajectory using OCT. Translation of the mirror in the reference arm to vary the path length was required to maintain the OCT image within the limited imaging range. The long coherence length of the VCSEL light source supports long-range imaging with minimal sensitivity roll-off [45] and higher speed data acquisition cards of up to 4 GS/s are now commercially available, suggesting that imaging range could be increased in the future. Endoscopic OCTA using a small ~3 mm diameter micromotor probe compatible with the working channel of the endoscope [17] has been previously demonstrated, but the field of view is limited. Extending to wider fields of view provided by the capsule is challenging because OCTA requires precise and repeatable scanning in two dimensions in order to calculate scan-to-scan phase or intensity decorrelation. This is an important topic of continuing research and development.
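As an illustration of post-acquisition longitudinal correction, the following minimal sketch resamples an en face image onto a uniform longitudinal grid using a per-frame measured position. It is not the processing pipeline used in this study: the function name `resample_longitudinal`, the array layout (one row per rotary frame), and the choice of linear interpolation are all assumptions made for the example. Frames are ordered by measured position, which presumes that small back-and-forth carriage motion can be handled by position sorting alone.

```python
import numpy as np


def resample_longitudinal(enface, z_measured_mm, dz_mm):
    """Resample en face rows onto a uniform longitudinal grid.

    enface        : array (n_frames, n_angles), one row per rotary frame
    z_measured_mm : array (n_frames,), measured longitudinal position per frame
    dz_mm         : desired uniform longitudinal pitch (hypothetical parameter)
    """
    enface = np.asarray(enface, dtype=float)
    z = np.asarray(z_measured_mm, dtype=float)
    if enface.ndim != 2 or z.ndim != 1 or enface.shape[0] != z.shape[0]:
        raise ValueError("expected enface (n_frames, n_angles) and one z per frame")
    if dz_mm <= 0:
        raise ValueError("dz_mm must be positive")

    # np.unique sorts positions and drops repeats; np.interp needs increasing x.
    z_unique, first_idx = np.unique(z, return_index=True)
    rows = enface[first_idx]

    # Uniform grid spanning the measured travel.
    z_uniform = np.arange(z_unique[0], z_unique[-1] + 0.5 * dz_mm, dz_mm)

    # Linear interpolation along the longitudinal axis, one angle at a time.
    out = np.empty((z_uniform.size, rows.shape[1]))
    for col in range(rows.shape[1]):
        out[:, col] = np.interp(z_uniform, z_unique, rows[:, col])
    return z_uniform, out
```

For example, frames recorded at positions 0, 0.1, 0.4, and 1.0 mm can be resampled to a 0.25 mm pitch with `resample_longitudinal(image, [0.0, 0.1, 0.4, 1.0], 0.25)`. Linear interpolation is the simplest choice here; a real implementation might prefer a different kernel or reject frames with unreliable position estimates.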
The feasibility of the manual longitudinal actuation for generating en face images depends largely on the imaging speed of the OCT system. It is difficult to manually actuate a semirigid tether at a constant speed slower than a few millimeters per second, which would be required to achieve Nyquist sampling at limited axial scan rates. Anatomic variations and resultant tensional changes in the tether during the scan further exacerbate the difficulty in scanning smoothly. Length calibrated actuation using motorized translation stages, position sensors [46] or image-based tracking [47] would help to localize the capsule and/or control its speed. The availability of even higher speed swept source systems would relax the requirement for slow manual actuation, and even enable increased manual longitudinal scan speeds for more rapid volumetric data acquisition over large fields of view. Higher speeds are also important for functional OCTA imaging. Swept sources with multi-megahertz A-scan rates [2] would enable longitudinal actuation speeds of up to 1 cm/s, which is moderately fast and should be easier to perform than slower speeds.
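The relationship between imaging speed and the maximum usable manual actuation speed can be sketched numerically. The example below is a rough estimate under stated assumptions, not a specification of the system: the 20 µm spot size and the criterion of two frames per spot are hypothetical values chosen for illustration, and the function names are introduced only for this example. The 1 MHz A-scan rate and 250 Hz frame rate are the values reported for this study, which together imply 4000 A-scans per frame.

```python
def frame_rate_hz(a_scan_rate_hz, a_scans_per_frame):
    """Rotary frame rate obtained from the A-scan rate and A-scans per rotation."""
    if a_scan_rate_hz <= 0 or a_scans_per_frame <= 0:
        raise ValueError("rates and counts must be positive")
    return a_scan_rate_hz / a_scans_per_frame


def max_pullback_speed_mm_s(frame_rate, spot_size_um, frames_per_spot=2.0):
    """Fastest longitudinal speed (mm/s) that keeps `frames_per_spot` frames per spot.

    spot_size_um and frames_per_spot are assumptions for illustration only.
    """
    if frame_rate <= 0 or spot_size_um <= 0 or frames_per_spot <= 0:
        raise ValueError("all arguments must be positive")
    pitch_um = spot_size_um / frames_per_spot  # longitudinal spacing between frames
    return frame_rate * pitch_um / 1000.0      # convert um/s to mm/s


# 1 MHz A-scan rate, 4000 A-scans per frame -> 250 Hz frame rate
slow = max_pullback_speed_mm_s(frame_rate_hz(1e6, 4000), spot_size_um=20.0)

# Hypothetical 4 MHz source with the same A-scans per frame -> 1000 Hz frame rate
fast = max_pullback_speed_mm_s(frame_rate_hz(4e6, 4000), spot_size_um=20.0)

print(f"{slow:.1f} mm/s at 1 MHz, {fast:.1f} mm/s at 4 MHz")
```

Under these assumed values, the limit is a few millimeters per second at 1 MHz and rises in proportion to the A-scan rate, consistent with the argument that multi-megahertz sources would permit faster and therefore easier manual actuation.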
In the upper GI tract, the capsule can cover virtually the entire length of the esophagus by proximal manual actuation. Dense optical sampling and en face imaging would enable mapping of the unwrapped esophageal surface to identify focal pathology. However, the generation of complete and unobscured en face views depends on contact of the full esophageal circumference with the capsule surface during the proximal actuation. Achieving uniform tissue contact was challenging in the sedated swine; in the esophagus, full circumferential contact occurred only occasionally, and en face OCT images often showed large areas of non-contact. Previous OCT capsule imaging studies in humans reported good contact, with 94% of acquired frames having more than 50% coverage when the capsule was swallowed by non-sedated, sitting patients [34]. This is likely due to the peristalsis produced when the patient swallows the capsule, causing the esophageal wall to contract around the capsule, while the peristaltic wave propels it downward. In a sedated animal model, peristalsis could not be induced, which may have resulted in highly variable tissue contact, exacerbated by the effects of cardiac motion. It is likely that swallow-induced peristalsis in either conscious or moderately sedated human patients will improve tissue-capsule contact.
The capsule could be clinically applicable not only for upper GI screening in unsedated patients, but also as an adjunct to the endoscope during standard upper GI endoscopy. Previous imaging studies in unsedated patients used a highly flexible tether to optimize patient comfort [32,33]. For imaging during standard endoscopy, the tether should be semirigid, allowing arbitrary longitudinal positioning and actuation regardless of esophageal motility or gravitational force. In the swine study, the semirigid tether enabled large field of view imaging by pulling back or advancing the capsule, as well as small field of view imaging using precision distal pneumatic scanning while maintaining the capsule's orientation and longitudinal position. Although swallowing could not be induced in the swine and thus an overtube was required to introduce the capsule, a moderately sedated patient is able to swallow to open the upper esophageal sphincter for introduction of the capsule, as is done for an endoscope. After introduction of the capsule, the small-diameter tether remaining in the patient's mouth and throat should be well tolerated and more comfortable than a larger diameter endoscope. We have performed preliminary clinical studies in patients which suggest that the capsule can be introduced independently into the esophagus prior to introduction of the endoscope and can obtain good tissue contact and en face imaging. It should be noted that the tether flexibility can be changed independently of the distal device design, enabling variations of this device to be used for multiple endoscopic and possibly intraoperative imaging applications, such as imaging of surgical cavities. The semirigid tether might also be used without sedation; there are ultrathin (<6 mm diameter) endoscopes and capsule catheters with semirigid tethers that are used transorally without sedation [48,49].
The ability to perform capsule OCT imaging as an adjunct to endoscopy suggests applications for surveillance as well as targeting of specific areas for ablative therapies based on real-time image guidance.
In the lower GI tract, the ability to position the imaging device is particularly critical since peristalsis cannot be used as in the upper GI tract. In the swine study, there was minimal motion in the lower GI tract, enabling high resolution en face visualization of surface pit patterns, although circumferential contact was variable and dependent on lumen size. Tissue contact was good in the anal canal and slightly worse in the rectum. The ease of introduction into the rectum suggests that the capsule could be used for non-endoscopic evaluation and other clinical studies of anal and rectal cancers, inflammation, and vascular pathologies [50]. For imaging further up in the transverse and ascending colon, the capsule could be carried on the distal end of the endoscope with a band or mount. The capsule would not be able to image the full luminal circumference of the colon, but when placed in contact with a region of interest it would image a much larger angular field than a small diameter probe. Distal pneumatic longitudinal actuation is particularly well suited for this application; once the capsule is positioned in contact with a region of interest, the distal scan can cover a field of view that is larger than that of magnification endoscopy or confocal laser endomicroscopy.