Low-cost, chromatic confocal endomicroscope for cellular imaging in vivo.

We have developed a low-cost, chromatic confocal endomicroscope (CCE) that can image a cross-section of the tissue at cellular resolution. In CCE, a custom miniature objective lens was used to focus different wavelengths into different tissue depths. Therefore, each tissue depth was encoded with the wavelength. A custom miniature spectrometer was used to spectrally-disperse light reflected from the tissue and generate cross-sectional confocal images. The CCE prototype had a diameter of 9.5 mm and a length of 68 mm. Measured resolution was high, 2 µm and 4 µm for lateral and axial directions, respectively. Effective field size was 468 µm. Preliminary results showed that CCE can visualize cellular details from cross-sections of the tissue in vivo down to the tissue depth of 100 µm.


Introduction
Cervical cancer is one of the most prevalent cancers in women living in low- and middle-income countries (LMICs) [1]. Histopathological analysis of the cervical biopsy is the standard-of-care diagnostic approach in developed countries. However, histopathological analysis can often be challenging in LMICs due to the scarcity of pathology labs and trained personnel. Even when a pathology lab is accessible in LMICs, histopathologic diagnosis typically takes several weeks. This time delay poses challenges in initiating adequate treatment, especially when patients living in rural areas need to make multiple trips to an urban hospital. Low-cost diagnostic approaches such as visual inspection with acetic acid (VIA) [2] and HPV testing [3,4] can aid early diagnosis and treatment of cervical malignancy in LMICs. However, these low-cost diagnostic approaches have been reported to provide low specificity, which can lead to unnecessary treatment of women with benign conditions [5,6]. Therefore, there is an unmet need for a low-cost diagnostic tool that can provide both high diagnostic sensitivity and specificity.
Confocal endomicroscopy can directly visualize cellular morphologic features from the intact human tissue without biopsy and histopathology. Previously, various confocal endomicroscopy devices have been developed and successfully evaluated for imaging a wide range of human tissues, including oral mucosa [7][8][9], ovarian tissue [10], colonic mucosa [11], esophageal tissue [12], and glioma tissue [13]. For diagnosis of cervical malignancy, confocal microscopy was shown to reveal disease-associated cellular morphologic changes and provide high diagnostic sensitivity (100% with reflectance contrast [14] and 100% with fluorescence contrast [15]) and high specificity (100% with reflectance contrast [14] and 92% with fluorescence contrast [15]). Therefore, if confocal microscopy were available as a low-cost imaging tool, it could aid accurate diagnosis and timely treatment of cervical malignancy in LMICs. However, widespread adoption of confocal microscopy in LMICs has not yet occurred, mainly due to the high device cost (> $50,000). High-resolution micro-endoscopy (HRME) greatly reduced the device cost (material cost < $5,000) by integrating a fiber bundle and fluorescence microscope in a portable form factor [16]. HRME demonstrated promising sensitivity of 84-97% and specificity of 54-74% from several clinical studies conducted in LMICs [17,18]. However, HRME was reported to face challenges when imaging tissues with high nuclear density due to the lack of optical sectioning capability [19]. Low-cost HRME devices with confocal optical sectioning capability have been recently developed [20][21][22], but the achieved resolution (axial half-width-half-maximum = ∼70 µm [20]) was not as high as the resolution of high-cost confocal microscopes (axial full-width-half-maximum (FWHM) = 2-5 µm) that demonstrated high diagnostic sensitivity and specificity, or the resolution of previously-developed confocal endomicroscopy devices (axial FWHM = 2-10 µm) [23][24][25].
We have previously developed a low-cost, smartphone-based confocal microscope [26] and demonstrated confocal imaging of human skin in vivo in Uganda [27]. In the smartphone confocal microscope, a diffraction grating was used to focus different wavelengths into different lateral locations of the tissue. This approach obviates the need for expensive beam-scanning devices and their associated electronics. While our previous smartphone confocal microscope achieved high resolution (axial FWHM = 5 µm), comparable to the resolution of high-cost confocal microscopes, its direct application for imaging cervix in vivo has two major challenges. First, the device needs to be modified as an endoscopic probe for introduction through the vaginal canal. Second, the smartphone confocal microscope generates en face images, visualizing a single tissue depth at a time. The cervical epithelium has varying cellular morphology as a function of tissue depth. Therefore, an en face-imaging microscopy device needs to use a precision axial scanning mechanism for three-dimensional imaging. A precision axial scanning mechanism can be difficult to incorporate in a low-cost, endoscopic imaging device because 1) patient or device motion during axial scanning hampers accurate depth registration, and 2) use of the axial scanning mechanism increases the device cost and complexity.
Chromatic confocal microscopy is a promising approach to image multiple tissue depths in a single image. In this approach, the chromatic focal shift is used to focus different wavelengths into different depths. In a previous chromatic confocal microscope, relay optics composed of four off-the-shelf aspheric lenses were used to generate chromatic focal shift [28]. This chromatic confocal microscope achieved high resolution (axial FWHM = 3 µm) over an imaging depth range of > 150 µm and demonstrated cellular imaging capability. However, an expensive pulsed laser and high-magnification objective lens need to be used, which poses challenges in implementing this microscope as a low-cost, endoscopic imaging device. In another previous work, a gradient index (GRIN) lens was used to construct a chromatic confocal microendoscope [29]. The diameter of the GRIN lens was small, 1 mm, which makes this microendoscope promising for imaging the cervix. However, the imaging depth range was relatively small, 40 µm, the axial resolution was moderate, 13-21 µm, and cellular imaging capability was not demonstrated.
In this paper, we present development of a low-cost, chromatic confocal endomicroscope (CCE) for imaging a cross-section of the tissue at cellular resolution. We optimally designed a custom objective lens to cover a sufficiently large imaging depth range, while achieving high resolution. We also developed a miniature spectrometer to spectrally-disperse light returning from the tissue and generate cross-sectional confocal images. Details of the CCE prototype are described, including designs of the custom objective lens and miniature spectrometer. Preliminary results from the resolution measurement and tissue imaging experiment are presented.
Figure 1 shows a schematic of CCE. In CCE, an inexpensive LED (CMA1303-0000-000C0U0A27G, Cree Inc.; color temperature = 2700 K; emission diameter = 4.5 mm; viewing angle = 120°; unit price = $6) was used as the light source. The power density of the LED was not uniform across the spectral band of 500-700 nm used in CCE. At 500 nm and 700 nm, the power density was around 30% of the power density at the peak wavelength, 610 nm. A multimode fiber (TC-1500-22, Asahi Kasei; core diameter = 1.5 mm; cladding diameter = 1.55 mm; numerical aperture (NA) = 0.5) was used to deliver light from the LED to a custom illumination slit (width = 4 µm; length = 1.4 mm). A relatively large fiber core diameter of 1.5 mm was used to illuminate the entire length of the slit, 1.4 mm. Light from the fiber was directly coupled to the illumination slit without any optical element in between since the core diameter and NA of the fiber were larger than the slit length and the effective NA used after the illumination slit, 0.11. Light from the slit was collimated using an achromatic doublet (45-262, Edmund; clear aperture = 2.7 mm; focal length = 12 mm). Different wavelengths of the collimated light were focused into different tissue depths by a custom objective lens (focal length = 4.8 mm; NA = 0.68). 
Details of the custom objective lens are discussed in the following section. The illumination beam was off-center from the optical axis of the objective lens by 1.625 mm. Diameter of the illumination beam was 2.7 mm, which resulted in an effective illumination NA of 0.28. Light reflected from the tissue was collected by the same objective lens and was focused by another achromatic doublet (45-262, Edmund; clear aperture = 2.7 mm; focal length = 12 mm) onto a custom detection slit (width = 4 µm; length = 1.4 mm) to achieve confocal optical sectioning. The slit width was 1.47 times as large as the FWHM of the Airy disk on the detection slit for 600 nm. CCE used a divided pupil approach, where the illumination path and the detection path used different regions of the objective lens pupil. The divided pupil approach provides increased image contrast [30] and decreases detection of specular reflection from the objective lens. A custom miniature spectrometer was used to spectrally-disperse the detection beam and generate a cross-sectional confocal image on a miniature CMOS sensor (XD-B209M3-01, Misumi Electronics; pixel size = 1.4 µm; number of pixels = 1280 × 720). Details of the spectrometer are provided in the following section. Image data from the CMOS sensor were transferred to a consumer-grade laptop (Surface Book, Microsoft) or a smartphone (Samsung Galaxy S8+) via a USB 2.0 cable. Mechanical holders were designed and fabricated using 3D printers (Form 2 and Form 3, Formlabs). Most of the optical elements were aligned passively using a tight fit with the mechanical holders, but some alignments were completed actively while optimizing optical performance.
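The NA and slit-width figures quoted above can be reproduced with simple paraxial arithmetic. The sketch below is illustrative only: it assumes the effective NA is approximated by the beam radius divided by the focal length, and that the Airy-disk FWHM is approximated by 0.51 λ/NA. Neither formula is stated in the text, and the function names are hypothetical.

```python
# Paraxial sanity check of the NA and slit-width values quoted in the text.
# Assumptions (not stated in the source): NA ~ beam radius / focal length,
# and Airy-disk FWHM ~ 0.51 * wavelength / NA.

def marginal_na(beam_diameter_mm, focal_length_mm):
    """Approximate NA of a beam of given diameter focused by a lens."""
    return (beam_diameter_mm / 2.0) / focal_length_mm

def airy_fwhm_um(wavelength_um, na):
    """Approximate FWHM of the Airy disk intensity profile."""
    return 0.51 * wavelength_um / na

beam_diameter_mm = 2.7      # clear aperture of the achromatic doublets
na_slit = marginal_na(beam_diameter_mm, 12.0)    # NA on the slit side (12-mm doublet)
na_tissue = marginal_na(beam_diameter_mm, 4.8)   # NA on the tissue side (4.8-mm objective)

slit_width_um = 4.0
slit_to_airy_ratio = slit_width_um / airy_fwhm_um(0.6, na_slit)

print(f"NA at slit   = {na_slit:.4f}")
print(f"NA at tissue = {na_tissue:.4f}")
print(f"slit width / Airy FWHM at 600 nm = {slit_to_airy_ratio:.2f}")
```

Under these assumptions the slit-side NA comes out near the stated 0.11, the tissue-side NA near 0.28, and the slit-to-Airy ratio near 1.47.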

Custom objective lens
The custom objective lens was composed of an aspheric singlet and a plano-convex lens (Fig. 2(a)). Lens materials with low Abbe numbers were chosen to induce large chromatic focal shift: S-LAH60 (nd = 1.834; vd = 37.16) for the aspheric singlet, and N-SF11 (nd = 1.785; vd = 25.68) for the plano-convex lens. The aspheric singlet had an aspheric surface on one side and a nearly flat spherical surface on the other side. The glass molding approach was used to fabricate aspheric singlets. The aspheric singlet had a clear aperture of 6.5 mm and a center thickness of 2.2 mm. The plano-convex lens was fabricated by glass polishing. The plano-convex lens had a clear aperture of 3.0 mm and a center thickness of 2.0 mm. The objective lens had an effective focal length of 4.8 mm and an NA of 0.68. The objective lens was optimized for water immersion. The source spectrum of 500-700 nm was dispersed over an imaging depth range of 110 µm: 500 nm, 600 nm, and 700 nm focused at the tissue depth of 15 µm, 80 µm, and 125 µm, respectively (inset, Fig. 2(a)). Figure 2(b) shows RMS wavefront error as a function of field height on the tissue plane. At the central wavelength of 600 nm, the objective lens had a diffraction-limited performance over a field size of 560 µm. We performed a Monte-Carlo tolerance analysis on the objective lens based on the lens manufacturing tolerances. The tolerance analysis showed that more than 90% of the objective lenses would provide diffraction-limited performance.
We simulated confocal point-spread function (PSF) to estimate the axial and lateral resolution of CCE. The confocal PSF was calculated by multiplying the illumination PSF with detection PSF similarly to dual-axis confocal microscopy [30,31]. First, a diffraction-limited PSF was generated for an effective NA of 0.28 and uniform illumination using an ImageJ plugin PSFGenerator [32]. The diffraction-limited PSF was then rotated within the xz-plane (Fig. 2(a)) by 14.7°, the illumination angle, and convolved with the demagnified image of the illumination slit on the tissue. Similarly, detection PSF was generated by rotating the diffraction-limited PSF within the xz-plane by -14.7° followed by convolution with the detection slit image on the tissue plane. The resulting confocal PSF at 600 nm had lateral (x) and axial (z) FWHM values of 1.24 µm and 4.44 µm, respectively. Lateral resolution along the slit length direction (y-axis in Fig. 2(a)) was determined by the diffraction-limited spot size based on the detection NA and wavelength since the tissue was illuminated over a line along the y-axis. The theoretical lateral resolution along the y-axis at 600 nm was calculated as 1.09 µm.
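As a rough cross-check of the geometry used in the PSF simulation, the illumination angle and the size of the slit image on the tissue can be estimated from the pupil offset and focal lengths. This is a sketch under assumptions not stated in the text: the objective is taken to obey the sine condition, the immersion medium is water with a refractive index of 1.333, and the slit is demagnified by the ratio of the objective and collimator focal lengths. The function names are hypothetical.

```python
import math

# Assumptions (not from the source): the objective obeys the sine condition,
# offset = f * n * sin(theta), and water has n = 1.333.

def tilt_angle_deg(pupil_offset_mm, focal_length_mm, n_immersion):
    """Chief-ray angle in the immersion medium for an off-axis pupil position."""
    sin_theta = pupil_offset_mm / (focal_length_mm * n_immersion)
    return math.degrees(math.asin(sin_theta))

def slit_image_um(slit_width_um, f_objective_mm, f_collimator_mm):
    """Width of the slit image on the tissue (paraxial demagnification)."""
    return slit_width_um * f_objective_mm / f_collimator_mm

illumination_angle = tilt_angle_deg(1.625, 4.8, 1.333)
slit_on_tissue = slit_image_um(4.0, 4.8, 12.0)

print(f"illumination angle in water ~ {illumination_angle:.1f} deg")
print(f"slit image width on tissue  ~ {slit_on_tissue:.2f} um")
```

With these assumptions the angle is close to the 14.7° used for the PSF rotation, and the 4-µm slit maps to about 1.6 µm on the tissue.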
Contrary to conventional line-scan confocal microscopy, CCE was expected to have better lateral resolution along the slit length direction (y) than the slit width direction (x). This is due to the fact that both illumination and detection slit widths were larger than the lateral FWHM of the focal point: the de-magnified image of the slit width on the tissue was 1.6 µm, while FWHM of the Airy disk for 600 nm and 0.28 NA was 1.09 µm. This also made the resolution not linearly related to the wavelength and not diffraction-limited: the lateral (x) FWHM varied between 1.24 and 1.27 µm and the axial (z) FWHM between 4.19 and 4.82 µm for the wavelength range of 500 to 700 nm.
Figure 3(a) shows a schematic of the miniature spectrometer. Light filtered by the detection slit was collimated by an achromatic doublet (45-262, Edmund; clear aperture = 2.7 mm; focal length = 12 mm) and then diffracted by a grating-prism (GRISM). A blazed grating with a groove density of 150 lines/mm and a blaze angle of 8.6° (34003FL07-760R, Newport) was used to make the GRISM. The blazed grating provided high 1st-order diffraction efficiency, 70-80% for 500-700 nm. The incidence angle on the grating surface of the GRISM was set as 10°, which made the 1st-order diffraction beam at 600 nm propagate parallel to the optical axis of the spectrometer [33]. The diffracted light was focused on a CMOS sensor by another achromatic doublet (45-262, Edmund; clear aperture = 2.7 mm; focal length = 12 mm). A custom plano-concave lens (material = N-BK7; focal length = -3.5 mm) was placed in front of the CMOS sensor to reduce the field curvature. Figure 3(b) shows RMS wavefront error as a function of field height on the CMOS sensor. The spectrometer provided diffraction-limited performance over a full field size of 1.75 mm. The expected field size on the tissue of 560 µm and theoretical lateral resolution along the slit length direction of 1.09 µm resulted in 514 resolvable points along the lateral direction. 
The 560-µm lateral field on the tissue was imaged to a 1.75-mm width on the CMOS sensor, which was then sampled by 1,250 pixels. Similarly, the expected axial imaging depth range of 110 µm and theoretical axial resolution of 4.44 µm resulted in 25 resolvable points along the axial direction. The GRISM groove density of 150 lines/mm, beam diameter of 2.7 mm, bandwidth of 200 nm, and central wavelength of 600 nm resulted in 135 resolvable points for the spectrometer [34], which was significantly larger than the number of resolvable points determined by the chromatic focal shift and axial resolution. The spectral band of 500-700 nm was mapped to a 0.48-mm height on the CMOS sensor, which was then sampled by 343 pixels. Therefore, Nyquist sampling criterion was met for both lateral and axial directions.
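The resolvable-point and sampling figures above follow from short arithmetic, sketched below. One step is an assumption rather than something stated in the text: the spectrometer's resolvable points are estimated from the first-order grating resolving power, R = number of illuminated grooves, so that the smallest resolvable wavelength step is λ/R.

```python
# Resolvable points and pixel sampling, using values quoted in the text.
# Assumption: first-order grating resolving power R = number of illuminated grooves.

pixel_um = 1.4

# Lateral direction (slit length)
lateral_points = 560.0 / 1.09            # field size / lateral resolution
lateral_pixels = 1750.0 / pixel_um       # 1.75-mm image width on the sensor

# Axial direction (chromatic focal shift)
axial_points = 110.0 / 4.44              # depth range / axial resolution
axial_pixels = 480.0 / pixel_um          # 0.48-mm image height on the sensor

# Spectrometer
illuminated_grooves = 150.0 * 2.7        # lines/mm * beam diameter in mm
delta_lambda_nm = 600.0 / illuminated_grooves
spectral_points = 200.0 / delta_lambda_nm

print(f"lateral: {lateral_points:.0f} points on {lateral_pixels:.0f} pixels "
      f"({lateral_pixels / lateral_points:.1f} pixels per point)")
print(f"axial:   {axial_points:.0f} points on {axial_pixels:.0f} pixels "
      f"({axial_pixels / axial_points:.1f} pixels per point)")
print(f"spectrometer: {spectral_points:.0f} resolvable points")
```

Both directions give more than two pixels per resolvable point, which is the sense in which the Nyquist criterion is met.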

Fabrication of slits and GRISM
We fabricated custom, miniature slits and GRISM used in CCE. Multiple slits were fabricated on a cover glass (size = 25 × 25 mm²; thickness = 170 µm). First, a thin aluminum layer (thickness = 300 nm) was deposited on the cover glass using an e-beam evaporator (FC 2500, Temescal). Second, a thin photoresist layer was spin-coated on top of the aluminum layer. Slit patterns (number of slit lines = 10; width of each slit = 4 µm; length of each slit line = 25 mm) were exposed on the photoresist layer using a mask-less aligner (MLA150, Heidelberg Instruments). The aluminum layer was etched using an aluminum etching solution (Transene Company, Inc.), and the photoresist layer was removed by acetone and isopropyl alcohol. The slit-patterned cover glass was diced into small segments (size of each segment = 2 mm × 2.5 mm) using a dicing machine (DAD-320, Disco Co.). Approximately 50 slits were made from a cover glass with a size of 25 × 25 mm². Electron microscopy image of the slit is shown in Fig. 4(a). Measured width of the slit was 3.78 µm, close to the design slit width. Optical microscopy image of the slit with trans-illumination is shown in Fig. 4(b). The unetched aluminum layer blocked the illumination light down to the level of the microscope camera noise, which indicated that the thin aluminum layer was sufficient in providing high optical contrast between the slit opening and blocking areas. The custom, miniature GRISM was fabricated by core-drilling the bulk grating and polishing the non-grating surface. The bulk grating (size = 25 × 25 mm²) was mounted on a slide glass using wax (CrystalBond 309, SPI supplies) with the grating surface facing the slide glass. The bulk grating was then placed under a core drill bit (diamond-coated; inner diameter = 3.3 mm) with a 10° tilt angle. The non-grating surface of the core-drilled grating was polished. A total of 25 GRISMs were fabricated from a bulk grating. Photos of the fabricated GRISM are shown in Fig. 5. 
Measured clear aperture of the GRISM was typically larger than 2.7 mm in diameter. Surface roughness of the polished surface of the GRISM was measured by a white light interferometer (Newview 8300, Zygo) and was typically smaller than 0.06 µm peak-to-valley for the 2.7-mm clear aperture. This surface roughness corresponded to the peak-to-valley wavefront error of 0.03 µm through the GRISM, which was significantly smaller than the diffraction-limited performance criterion at 600 nm, 0.15 µm.
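The conversion from surface roughness to transmitted wavefront error can be illustrated with a short calculation. This is a sketch under two assumptions that the text does not state: the GRISM substrate has a refractive index of about 1.5, and the diffraction-limited criterion of 0.15 µm corresponds to a quarter-wave peak-to-valley limit at 600 nm.

```python
# Assumptions (not from the source): substrate index ~1.5 and a
# quarter-wave peak-to-valley criterion for diffraction-limited performance.

def transmitted_wavefront_pv_um(surface_pv_um, n_glass, n_outside=1.0):
    """Peak-to-valley wavefront error after one pass through a rough surface."""
    return (n_glass - n_outside) * surface_pv_um

wavefront_pv_um = transmitted_wavefront_pv_um(0.06, n_glass=1.5)
quarter_wave_um = 0.600 / 4.0

print(f"wavefront PV        ~ {wavefront_pv_um:.3f} um")
print(f"quarter-wave limit  = {quarter_wave_um:.3f} um")
print(f"fraction of limit   = {wavefront_pv_um / quarter_wave_um:.2f}")
```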

Image processing
Along the lateral direction, a 560-µm lateral field of the tissue was mapped to 1,250 pixels of the CMOS sensor. Therefore, each pixel along the lateral direction represented 0.448 µm on the tissue. Along the axial direction, however, the tissue depth was not linearly mapped to the CMOS sensor due to the non-linear relation between the focal shift and wavelength. This made the top portion of the image appear compressed (for shorter wavelengths) and the bottom portion appear stretched (for longer wavelengths). Flow of the image processing code is shown in Fig. 6. We used a theoretical model of the CCE optics, rather than experimental data of imaging a mirror at different depths, to estimate the relation between the tissue depth and pixel index on raw image data. The short working distance of the CCE device made it challenging to accurately measure the absolute distance between the lens bottom surface and mirror. The image processing code based on the theoretical model was later tested using experimental data. Relation between the wavelength and tissue depth was simulated using the ZEMAX model of the objective lens and focusing lens. Relation between the wavelength and y-index on the raw image data (y_raw) was simulated using the ZEMAX model of the spectrometer. These two simulation results were used to generate a curve between the depth and y_raw. The depth that corresponded to each y-index of the corrected image (y_corr) was calculated, and the y-index in the raw image data (y_raw) associated with that depth was subsequently calculated. The intensity profile at y_raw was then calculated by linearly interpolating the two neighboring rows in the raw image. This intensity profile was used to fill y_corr in the corrected image. The process was repeated until all rows of the corrected image were filled. The image processing code was applied off-line after the image acquisition was completed. 
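The row-remapping step described above can be sketched in a few lines of NumPy. This is not the authors' code. In particular, the depth-versus-row curve here is a hypothetical stand-in for the ZEMAX-derived curve: a quadratic through the three design points quoted earlier (500, 600, and 700 nm focused at 15, 80, and 125 µm), with wavelength assumed to vary linearly across 343 sensor rows. The function name `correct_depth_scale` is also an assumption.

```python
import numpy as np

def correct_depth_scale(raw, depth_of_row_um, out_rows):
    """Resample raw rows (non-linear in depth) onto a uniform depth grid.

    raw: 2D array, rows indexed by y_raw, columns by lateral position.
    depth_of_row_um: depth for each raw row; must be strictly increasing.
    """
    raw = np.asarray(raw, dtype=float)
    depth_of_row_um = np.asarray(depth_of_row_um, dtype=float)
    n_raw = raw.shape[0]

    # Depth assigned to each row of the corrected image (uniform spacing).
    depths_corr = np.linspace(depth_of_row_um[0], depth_of_row_um[-1], out_rows)

    # Fractional raw-row index y_raw associated with each corrected depth.
    y_raw = np.interp(depths_corr, depth_of_row_um, np.arange(n_raw))

    # Linear interpolation between the two neighboring raw rows.
    lower = np.clip(np.floor(y_raw).astype(int), 0, n_raw - 2)
    frac = (y_raw - lower)[:, None]
    corrected = (1.0 - frac) * raw[lower] + frac * raw[lower + 1]
    return corrected, depths_corr

# Hypothetical depth curve standing in for the ZEMAX simulation results.
wavelengths_nm = np.linspace(500.0, 700.0, 343)
coeffs = np.polyfit([500.0, 600.0, 700.0], [15.0, 80.0, 125.0], 2)
depth_of_row_um = np.polyval(coeffs, wavelengths_nm)

# Synthetic raw image whose pixel value equals the depth of its row.
raw = np.tile(depth_of_row_um[:, None], (1, 4))
corrected, depths_corr = correct_depth_scale(raw, depth_of_row_um, out_rows=343)
```

In the synthetic example each corrected row reproduces its own assigned depth, which is a convenient self-check that the inverse mapping and the interpolation are consistent with each other.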
The image processing code was tested on an image of a thin fluorinated ethylene propylene (FEP) film (refractive index = 1.34; nominal thickness = 75 µm). The measured thickness of the film in the corrected image was 72 µm. This measured thickness closely matched the nominal thickness, and the difference between the measured and nominal thicknesses, 3 µm, was within the film manufacturing tolerance. We also evaluated accuracy of the image processing code for measuring small displacement of the sample. We acquired a stack of CCE images of a mirror while the CCE device was axially scanned relative to the mirror at the speed of 9.9 µm/sec. Effective CCE imaging speed was 10.9 frames/sec, which resulted in the axial step size of 0.91 µm. Minimum incremental step size of the motor (Z812B, Thorlabs) was 0.05 µm. CCE images were corrected using the image processing code, and the axial step size between two subsequent images was calculated. Average of the measured step size values was 0.95 µm, and standard deviation was 0.14 µm. Difference between the measured axial step size and input step size to the motor was small, 0.04 µm.

Performance test
Lateral and axial resolution was measured by imaging a Ronchi grating (frequency = 10 lines/mm). Water was placed between the Ronchi grating and objective lens. Lateral resolution was calculated by measuring the FWHM of the line spread function (LSF) along the horizontal direction in the CCE image, which was also the slit length direction. Lateral resolution was mainly measured along the slit length direction because this resolution determined the cellular appearance in the CCE image. Lateral resolution along the slit width direction determined width of the thin cross section imaged by CCE. Axial resolution was calculated by measuring the vertical FWHM of the axial response, which was the intensity profile along the vertical direction around the reflective part of the grating. Exposure time of the CMOS sensor was manually adjusted to avoid pixel intensity saturation. Lateral and axial resolution was measured at imaging depths of 50 µm, 75 µm, 100 µm, and 125 µm. A USAF resolution target was also imaged to analyze the lateral resolution. The USAF resolution target was placed at the imaging depth of 75 µm. A motorized stage was used to laterally scan the USAF resolution target perpendicular to the cross-sectional field of CCE. Resulting stack of CCE images was resliced to generate a stack of en face confocal images acquired at multiple imaging depths.
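A FWHM measurement of the kind described above can be sketched as follows. This is a minimal illustration, not the analysis code used in the study: it assumes a single dominant lobe on a zero baseline and locates the half-maximum crossings by linear interpolation between samples. The function name `fwhm` and the synthetic Gaussian test profile are assumptions.

```python
import numpy as np

def fwhm(x, y):
    """FWHM of the main lobe of a sampled profile (zero baseline assumed)."""
    x = np.asarray(x, dtype=float)
    y = np.asarray(y, dtype=float)
    peak = int(np.argmax(y))
    half = y[peak] / 2.0

    def crossing(step):
        # Walk away from the peak until the next sample is at or below half maximum.
        i = peak
        while 0 <= i + step < len(y) and y[i + step] > half:
            i += step
        j = i + step
        if j < 0 or j >= len(y):
            raise ValueError("profile does not fall below half maximum")
        # Linear interpolation between sample i (above half) and j (at or below half).
        t = (y[i] - half) / (y[i] - y[j])
        return x[i] + t * (x[j] - x[i])

    return crossing(+1) - crossing(-1)

# Synthetic profile: Gaussian with a 2.0-um FWHM, sampled at 0.448 um per pixel.
x_um = np.arange(-20, 21) * 0.448
sigma_um = 2.0 / (2.0 * np.sqrt(2.0 * np.log(2.0)))
profile = np.exp(-x_um**2 / (2.0 * sigma_um**2))
measured_fwhm_um = fwhm(x_um, profile)
print(f"measured FWHM ~ {measured_fwhm_um:.2f} um")
```

Profiles with prominent side lobes or a non-zero background would need baseline subtraction and a check that the crossings belong to the main lobe.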
Tissue imaging performance was tested by imaging human finger and lower lip in vivo. CCE images were compared with confocal images obtained with a portable confocal microscope previously developed for imaging human skin in vivo. The portable confocal microscope had a lateral resolution of 1.6 µm and axial resolution of 6 µm [35] and was shown to visualize characteristic cellular details of various skin lesions [36].
Figure 7(a) shows a photo of the CCE prototype. Overall diameter and length of the CCE prototype were 9.5 and 68 mm, respectively. Optical power incident on the tissue was small, 6.7 µW. List of components and their cost is shown in Table 1. Fabrication of ∼50 slits on a cover glass required two-hour use of the micro/nano fabrication cleanroom at the College of Optical Sciences, University of Arizona, which resulted in a batch cost of $70 or unit cost of $1.40 per slit. Cost of a bulk grating used for the GRISM fabrication was $310, which resulted in a unit material cost of $12.40 per GRISM. Total material cost of CCE was less than $1,500. The CCE prototype was further housed inside a stainless-steel tubing (outer diameter = 11 mm; length = 30 cm) and connected to a custom smartphone colposcope (Fig. 7(b)). Distal end of the CCE was sealed water-tight using a biocompatible epoxy and tested for immersion under 0.1% chlorine solution for disinfection. CCE image data were displayed and stored on the smartphone using a custom app (Fig. 7(c)).

Resolution measurement
Fig. 7. Photos of the CCE prototype (A), its connection to the smartphone colposcope (B), and smartphone app for CCE image data acquisition (C).
Representative LSF curves at the center of the field for different imaging depths are shown in Fig. 8(a). Each curve was normalized by the area under the curve. At depths of 50 and 75 µm (green and yellow lines, respectively), the LSF curves had minimal side lobes: side lobe height was lower than 10% of the main lobe height. At depth of 100 µm (red solid line), the FWHM of the main lobe became smaller, but side lobes became more prominent. At depth of 125 µm (red dotted line), instead of having an obvious main lobe, two lobes with comparable heights were noticed, which indicated degradation of lateral resolution. Measured FWHM values at different depths are shown in Fig. 8(b). FWHM value at the center of the field (solid line, Fig. 8(b)) varied from 1.75 µm, 2.19 µm, 1.60 µm to 5.46 µm as the imaging depth changed from 50 µm to 125 µm. This result indicated that lateral resolution of ∼2 µm was achieved at least down to the depth of 100 µm. Lateral resolution at 85% of the field height (dotted line, Fig. 8(b)) varied from 2.14 µm, 2.16 µm, 1.92 µm to 3.98 µm for the imaging depth from 50 µm to 125 µm and was similar to the resolution at the center of the field. Similar to the LSF curves at the center of the field, the LSF curves at 100 µm and 125 µm showed increase of side lobe height. A re-sliced en face image of the USAF resolution target located at the depth of 75 µm is shown in Fig. 8(c). Lines on group 9, element 1 (line period = 1.96 µm) were readily distinguished along both slit length and slit width directions. The lateral FWHM along the slit width direction was 1.97 ± 0.15 µm. Representative axial response curves at the center of the field for different imaging depths are shown in Fig. 9(a). Each curve was normalized by the area under the curve. 
The axial response curve at 100 µm (red solid line) showed a narrower FWHM of the main lobe than those at 50 and 75 µm, but broadening of the base was noticed. Broadening of the main lobe was noticed at 125 µm. Measured FWHM values at the center of the field at different depths are shown as the solid line in Fig. 9(b). The axial FWHM values at the center of the field were 4.16 µm, 4.17 µm, 3.61 µm, and 5.94 µm for depths of 50 µm, 75 µm, 100 µm, and 125 µm, respectively. The axial resolution was maintained high, ∼4 µm, at least down to the depth of 100 µm. The axial FWHM values at 85% of the field height (dotted line, Fig. 9(b)) were 5.20 µm, 4.51 µm, 2.80 µm, and 4.35 µm for depths ranging from 50 µm to 125 µm. Broadening of the axial response curve base was noticed for 100 µm and 125 µm at 85% of the field height.

Tissue imaging performance
During tissue imaging in vivo, exposure time of the CMOS sensor was increased to 0.25 seconds due to the low signal level. Corresponding imaging speed was 4 frames/sec. Confocal images of human finger in vivo obtained with CCE and portable confocal microscope are shown in Fig. 10 and Dataset 1 (Ref. [37]). In the cross-sectional CCE image (Fig. 10(a)), keratinized epidermal cells appear stellate in shape, similar to the cross-sectional and en face portable confocal microscopy images (Figs. 10(b) and (c)). The effective field size was 468 µm, 16% smaller than the theoretical field size. This discrepancy was caused by a slight angle of the polished surface of the GRISM relative to the optical axis of the spectrometer, which laterally shifted the confocal image on the CMOS sensor. Confocal images of the human lower lip in vivo obtained with CCE and portable confocal microscope are shown in Fig. 11 and Dataset 2 (Ref. [38]). In the CCE image (Fig. 11(a)), squamous epithelial cell nuclei (nuclear diameter = ∼8 µm; cell-to-cell distance = 15-38 µm) are clearly visible with a similar size and density to those shown in the portable confocal microscopy image (Fig. 11(b)). This result shows potential for CCE to visualize cervical epithelial cell nuclei (cell nuclear diameter = 6-8 µm [15]; cell-to-cell distance = 27-35 µm [14]). For both finger and lower lip images, cellular features were observable for an imaging depth range of ∼100 µm. Cellular features shown in CCE images were similar to those previously visualized by the commercial confocal microscopy device (Vivascope 3000, Caliber ID), Fig. 6.1f in [39] and Fig. 3(b) in [7].

Discussion
We have developed a low-cost, chromatic confocal endomicroscope that can visualize cellular details from a cross-section of the tissue in vivo. We optimally designed the miniature objective lens to map different wavelengths to different depths, which uniquely enabled cross-sectional, cellular imaging in a small endoscopy device. The CCE prototype achieved lateral resolution of 2 µm and axial resolution of 4 µm down to the imaging depth of 100 µm. The CCE prototype was able to visualize cellular features at different depths in a similar manner to the portable confocal microscope and commercial confocal microscope. The depth resolving capability will be useful in examining cellular morphologic changes as a function of tissue depth when imaging the cervical epithelium. One of the advantages of CCE is that confocal image data is transferred via a standard USB 2.0 cable, which makes it possible to use a low-cost mobile platform such as a smartphone to acquire, display, and store the confocal image data. These preliminary results indicate that CCE might have the potential to visualize cellular morphologic changes associated with cervical malignancies in LMICs.
Several limitations of the CCE prototype were also noticed. Cellular details were not readily observable beyond the imaging depth of 100 µm. This limitation is primarily due to the reduced power density of the LED at longer wavelengths, degraded resolution, and light attenuation by the tissue. However, the imaging depth range of 100 µm is not likely to hamper CCE's capability to accurately diagnose cervical malignancy. Previous research using confocal microscopy on cervical tissues indicated that confocal imaging at a depth range of 0 to 50 µm could reveal cellular morphologic differences between normal/benign tissues and malignant tissues [7,9].
Even within the effective imaging depth range of 100 µm, signal level varied as a function of imaging depth because i) the LED spectral density and GRISM diffraction efficiency were not uniform over the spectrum used, ii) reflectivity of the cellular feature changed as a function of wavelength, and iii) light absorption and scattering by the tissue decreased the illumination intensity with the increasing depth. This resulted in lower signal-to-noise ratio (SNR) at larger imaging depths. In order to mitigate this problem, multiple images can be acquired at the same tissue location but with different exposure times, and merged into a single image with high SNR similarly to the high-dynamic range imaging used in photography [40].
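The multi-exposure merging idea mentioned above can be sketched as a simple weighted average. This is only an illustration of the general high-dynamic-range approach, not a method evaluated in this work. It assumes a linear sensor response, 8-bit pixel values, and hypothetical thresholds (`low`, `high`) for rejecting under- and over-exposed pixels.

```python
import numpy as np

def merge_exposures(images, exposure_times_s, low=5.0, high=250.0):
    """Merge registered images taken at different exposure times.

    Assumes a linear sensor: pixel value is proportional to signal * exposure time.
    Pixels outside (low, high) are treated as under- or over-exposed and ignored.
    Returns an estimate of signal rate (counts per second).
    """
    stack = np.stack([np.asarray(im, dtype=float) for im in images])
    times = np.asarray(exposure_times_s, dtype=float)[:, None, None]
    rate = stack / times
    weights = ((stack > low) & (stack < high)).astype(float)
    total = weights.sum(axis=0)
    merged = (weights * rate).sum(axis=0) / np.maximum(total, 1.0)
    # Where no exposure is well exposed, fall back to the plain average.
    return np.where(total > 0, merged, rate.mean(axis=0))

# Synthetic example: three pixels with very different signal rates.
true_rate = np.array([[40.0, 400.0, 2000.0]])          # counts per second
times_s = [0.05, 0.25]
frames = [np.clip(true_rate * t, 0, 255) for t in times_s]
merged_rate = merge_exposures(frames, times_s)
```

In the synthetic example the dim pixel is recovered from the long exposure and the bright pixel from the short exposure, while the intermediate pixel uses both.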
Reflectivity of each cellular component varied as a function of wavelength. While the reflectivity variation changes the absolute signal level at each wavelength, relative contrast of key cellular feature against its surrounding might stay similar over a wide range of wavelengths (e.g. epithelial cell nuclei appearing bright relative to cytoplasm). Our preliminary tissue CCE images did not show significant cellular contrast changes over the imaging depth of 100 µm (corresponding spectral band = 500-642 nm). A similar trend of consistent cellular contrast over the imaging depth range of >150 µm (corresponding spectral band = 590-770 nm) was observed in a previous chromatic confocal microscope [28]. However, more quantitative analysis will be needed for measuring the cellular contrast as a function of imaging depth in future studies of imaging cervix with CCE.
Discrepancy between the measured and theoretical resolution was noticed. The measured lateral resolution was poorer than the theoretical value, and the lateral and axial resolution curves at deeper imaging depths had significant artifacts. In addition, while the simulation results indicated that the lateral resolution would be better along the slit length direction than the slit width direction, the measured lateral resolution was comparable between the slit length and width directions. In order to find causes of the resolution degradation, we measured resolution of the objective lens alone and found that the measured objective lens resolution was 41% larger than the theoretical value. We also measured resolution of the spectrometer alone, and the measured spectrometer resolution was 43% larger than the theoretical value. These resolution discrepancies were mainly caused by the alignment errors and shrinkage of epoxy during curing. While the current lateral and axial resolution of CCE still allowed for cellular imaging, we will further refine the alignment process by incorporating more degrees of freedom and using an epoxy with less shrinkage to achieve resolution closer to theoretical values.
The imaging speed of the CCE prototype was limited to 4 frames/sec when imaging tissues. The low illumination power, 6.7 µW, mandated the use of a long exposure time. The CCE prototype is intended to make direct contact with the tissue. During a clinical study of imaging the human cervix in vivo, the clinician will hold the CCE prototype on a suspicious location and acquire still images rather than continuously moving the CCE prototype to acquire videos. If motion blur makes a large portion of the CCE images difficult to interpret, we will evaluate the feasibility of applying a deep learning-based denoising method to CCE images obtained with a shorter exposure time, as was recently demonstrated in portable confocal microscopy [41]. In future prototypes, we will explore the use of a wider slit, which can increase the signal level and in turn increase the imaging speed, albeit at the expense of resolution. Since the current CCE prototype oversamples along the vertical direction, a future CCE prototype can use a GRISM with a lower groove density to generate the CCE image over a smaller CMOS sensor height. This will increase the signal level for each pixel and help increase the imaging speed.
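The scaling argument above can be illustrated with a short sketch. It assumes that the frame rate is limited only by exposure time, that the per-pixel signal is held at its current level, and that spreading the same light over a smaller sensor height raises the per-pixel signal in inverse proportion to that height. The function name, its parameters, and the optional slit gain factor are hypothetical and are not part of the CCE software; the signal gain from a wider slit is left as a user-supplied number because it was not quantified here.

```python
def scaled_frame_rate(base_fps, height_ratio=1.0, slit_signal_gain=1.0):
    """Estimate the exposure-limited frame rate after an optical change.

    base_fps         : current frame rate (4 frames/sec for this prototype)
    height_ratio     : new image height on the sensor / current height
                       (< 1 means a lower-dispersion GRISM, less oversampling)
    slit_signal_gain : assumed signal gain from a wider slit (1.0 = unchanged)

    Assumption: per-pixel signal is proportional to
    slit_signal_gain / height_ratio, and the exposure time is shortened
    by the same factor so that the per-pixel signal stays constant.
    """
    if base_fps <= 0 or height_ratio <= 0 or slit_signal_gain <= 0:
        raise ValueError("all arguments must be positive")
    per_pixel_gain = slit_signal_gain / height_ratio
    return base_fps * per_pixel_gain


# Example: halving the image height on the sensor, slit unchanged.
fps_half_height = scaled_frame_rate(4.0, height_ratio=0.5)
```

Under these assumptions the gain is purely proportional, so the sketch only bounds what a lower groove density could offer; sensor readout limits and the resolution penalty of a wider slit are not modeled.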
While the image processing code for correcting the depth scale of the CCE image appeared to agree well with the experimental data, its accuracy might differ when tissue is imaged, since the refractive index and dispersion of tissue are different from those of water. Using the ZEMAX model of the objective lens, we evaluated the change in imaging depth caused by changing the material from water to human epithelial tissue with a refractive index of 1.36 and an Abbe number of 45 [42]. The material change resulted in an imaging depth change of 1.8% at the nominal imaging depth of 125 µm. While this depth change is relatively small, the effects of tissue dispersion will need to be further studied by imaging a thin slab of excised tissue and comparing the corrected CCE images with images obtained with other methods such as optical coherence tomography and histology.
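To give a feel for how much tissue dispersion the quoted refractive index and Abbe number imply over the CCE spectral band, the following sketch converts them into a two-term Cauchy model, n(λ) = A + B/λ². This is an illustrative approximation and not the ZEMAX calculation used above. It assumes that the refractive index of 1.36 is specified at the d line (587.56 nm) and that the Abbe number uses the standard F (486.13 nm) and C (656.27 nm) lines; the function names are hypothetical.

```python
# Fraunhofer reference wavelengths in micrometres (standard F, d, C lines).
LAMBDA_F = 0.48613
LAMBDA_D = 0.58756
LAMBDA_C = 0.65627


def cauchy_from_abbe(n_d, abbe_number):
    """Return Cauchy coefficients (A, B) with n(lambda) = A + B / lambda**2.

    Assumes V = (n_d - 1) / (n_F - n_C) and a two-term Cauchy model,
    which fixes B from the F-C index difference and A from n_d.
    """
    inv_sq_diff = 1.0 / LAMBDA_F**2 - 1.0 / LAMBDA_C**2
    b = (n_d - 1.0) / (abbe_number * inv_sq_diff)
    a = n_d - b / LAMBDA_D**2
    return a, b


def refractive_index(wavelength_um, a, b):
    """Evaluate the two-term Cauchy model at a wavelength in micrometres."""
    return a + b / wavelength_um**2


# Tissue values quoted in the text: n = 1.36, Abbe number = 45.
A, B = cauchy_from_abbe(1.36, 45.0)

# Index at the two ends of the CCE spectral band (500-642 nm).
n_short = refractive_index(0.500, A, B)
n_long = refractive_index(0.642, A, B)
index_spread = n_short - n_long
```

Under these assumptions the index varies by well under 0.01 across the 500-642 nm band, which is in line with the small depth change reported above, although only the full lens model can turn the index change into a depth change.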
The base material cost of the CCE prototype was less than $1,500, but this cost did not include the cost and time required for an optical engineer to assemble the CCE prototype. Therefore, the actual cost of fabricating this initial prototype was significantly higher than $1,500. However, we expect that future commercial CCE devices can be fabricated at low cost by mass-producing custom optical components and automating active alignment steps, similar to the fabrication of consumer mobile devices. In addition, a miniature LED with sufficiently high brightness could be integrated into the distal part of the CCE device. This would obviate the need for a large-core optical fiber, which in turn would make the connection between the CCE device and the smartphone colposcope more flexible and make the CCE device easier to maneuver.