In vivo imaging of cervical precancer using a low-cost and easy-to-use confocal microendoscope

Cervical cancer incidence and mortality rates remain high in medically underserved areas. In this study, we present a low-cost (<$5,000), portable and user-friendly confocal microendoscope, and we report on its clinical use to image precancerous lesions in the cervix. The confocal microendoscope employs digital apertures on a digital light projector and a CMOS sensor to implement line-scanning confocal imaging. Leveraging its versatile programmability, we describe an automated aperture alignment algorithm to ensure clinical ease-of-use and to facilitate technology dissemination in low-resource settings. Imaging performance is then evaluated in ex vivo and in vivo pilot studies; results demonstrate that the confocal microendoscope can enhance visualization of nuclear morphology, contributing to significantly improved recognition of clinically important features for detection of cervical precancer. Preliminary results show that it has great potential to improve real-time characterization of nuclear morphology in cervical lesions, and future studies are warranted to further evaluate its performance in a larger population with automated diagnostic algorithms.


Introduction
Cervical cancer imposes an enormous burden on healthcare worldwide. Despite the effectiveness of screening programs in several high-income countries, cervical cancer has the second highest incidence and mortality rates among women in lower Human Development Index (HDI) areas [1][2][3]. Mirroring this disparity, medically underserved areas in the US, such as the Rio Grande Valley in Texas, also suffer from higher rates of cervical cancer compared to the rest of the country [4]. The geographic inequality is primarily attributed to limited access to screening and follow-up diagnostic and treatment services in these areas [5]. While routine screening in high-income settings has been shown effective in reducing mortality, there is a critical need to improve early detection of cervical cancer in low-resource settings that lack health-service infrastructure and trained personnel [6].
Per standard of care, women with positive screening are referred to colposcopy, and suspicious lesions are biopsied and evaluated to guide follow-up and treatment plans [7]. Recent studies have demonstrated that a low-cost high-resolution microendoscope (HRME) can improve detection of precancerous lesions by providing real-time histological information before biopsies are taken [8][9][10][11][12]. Parra et al. showed that the specificity of the HRME for detecting precancerous lesions (grade 2 or higher cervical intraepithelial neoplasia [CIN2+]) was higher than that of colposcopy in 174 women enrolled in a range of clinical settings, suggesting that in vivo microscopic imaging at the point of care could result in 21% fewer unnecessary biopsies [12]. In another prospective randomized trial involving 200 women with abnormal Papanicolaou tests in rural Brazil, Hunt et al. showed that women referred to mobile vans equipped with HRMEs were more likely to complete diagnostic follow-up than those referred to central hospitals (87% vs. 64%) [11].
While promising results have been reported using the widefield microendoscope, its imaging contrast degrades in turbid and highly scattering tissue due to out-of-focus background signal. In the cervix, increased nuclear density is commonly observed in squamous lesions associated with cancer progression such as CIN2+, as well as in columnar (glandular) epithelium. Since most precancerous and cancerous cervical lesions occur at the intersection of squamous and columnar tissues, it is critical to discern between these two tissue types and identify abnormalities around the squamocolumnar junction (SCJ). When imaging these sites, scattered light compromises the capability of microendoscope imaging to accurately characterize disease-associated nuclear features. As a result, structured illumination has been employed to investigate cervical lesions with improved background rejection [13,14]. Other optical sectioning techniques based on mechanical scanning or spectral encoding have also been employed to image highly scattering media in a wide range of ex vivo and in vivo clinical applications [15][16][17]. More recently, we implemented digital line-scanning apertures in the microendoscope using a digital light projector (DLP) and a CMOS sensor, which allows confocal background rejection at a low cost [18][19][20].
In this study, we optimize the confocal microendoscope to improve its portability and clinical ease-of-use, which are essential features for its clinical translation in low-resource and community settings. To facilitate technology dissemination, we develop an automated algorithm for fine alignment of confocal apertures that can be easily misaligned during device transportation. The automated and fast (< 2 mins) calibration procedure tunes confocality at the micron level and ensures high quality confocal imaging, which would otherwise require manual alignment of mechanical parts by a trained operator. As real-time imaging is displayed to the clinicians in confocal mode, we save two sequential frames in widefield (non-confocal) and confocal modes when image acquisition is triggered using a foot pedal. The sequential image acquisition in two modes allows us to evaluate contrast improvement due to confocal background rejection with clinical data in identical fields of view (FOVs) during ex vivo imaging, and similar FOVs during in vivo imaging as the fiber is hand-held.
We evaluate performance of the optimized confocal microendoscope when imaging ex vivo cervical specimens that harbor various nuclear features of clinical importance. Furthermore, we report its first in vivo evaluation for detection of cervical precancerous lesions in a mobile clinic along the Texas-Mexico border. The mobile clinic serves the Rio Grande Valley, a medically underserved area that suffers from a cervical cancer incidence rate about 55% higher than the country average in the US [21]. Finally, we assess the benefits of confocal scanning for the visual recognition and characterization of nuclear morphology by microendoscope users.

Confocal microendoscope: optical design and mechanical assembly
The confocal microendoscope shown in Fig. 1(A) is a probe-based fluorescence confocal microscope that uses a DLP (LightCrafter 4500, Texas Instruments, Dallas, Texas) as the light source and a CMOS sensor (Firefly MV USB 2.0, FLIR Integrated Imaging Solutions Inc., Richmond, Canada) as the detector. To adapt the commercially available DLP for fluorescence microscopy, the stock lens on the LightCrafter 4500 was removed, and a custom mount was precision machined to couple the DLP with a standard 30 mm cage system (Thorlabs, Newton, New Jersey). Programmed line patterns on the digital micromirror device (DMD) were focused through a collimation condenser (f = 125 mm, LA1986, Thorlabs, Newton, New Jersey) and a 10x objective (RMS10X, Thorlabs, Newton, New Jersey). To enable confocal imaging, line patterns illuminated by the blue LED of the DLP were projected onto the fiber bundle (FIGH-30-850N, Myriad Fiber Imaging; 790 µm diameter, compared to a 2 × 4 mm bite of the biopsy forceps used in this study) and scanned across the FOV without mechanical scanners. Fluorescence from tissue in contact with the fiber bundle was collected and descanned by the rolling shutter aperture of the CMOS sensor. Using commercially available components, the confocal microendoscope was built in a compact enclosure (14" × 10" × 3"). Figure 1(B) shows the compact confocal microendoscope used alongside colposcopy in an examination room inside a mobile clinic in the Rio Grande Valley, Texas. In this study, confocal images are displayed in real time at 7-8 frames per second and image acquisition is triggered using a foot pedal. Upon a foot pedal trigger, we capture and save two adjacent frames in confocal and widefield (non-confocal) modes, which allows us to evaluate confocal background rejection using clinical data of an identical or similar FOV. The illumination (blue rectangles) and detection (green parallelogram) aperture designs in the two imaging modes are shown in Fig. 1(C).
Detailed descriptions of confocal aperture sequences were previously provided [19]. Briefly, the detection slit aperture is achieved on the CMOS sensor by fixing its exposure time to limit the number of CMOS rows that are under exposure concurrently. During readout of each frame, a matching line illumination sequence is programmed on the DLP and projected in a synchronized manner without the need for mechanical scanning. In each frame acquisition, a trigger is sent from the DLP to the camera to ensure precise synchronization of confocal apertures, and their fine spatiotemporal alignment is illustrated in the left panel of Fig. 1(C), with the aperture widths described in CMOS rows. The confocal apertures are linearly scanned over the CMOS containing 1048 rows (960 active rows within the FOV) during a single readout of 65 ms. Notably, we implement parallel illumination to reduce the aperture width, and thus improve the axial response, when using an off-the-shelf projector with limited internal storage. In this study, we use a fixed exposure time of 1303 µs, which corresponds to readout of 21 CMOS rows and a line aperture width of 21.9 µm at the sample plane.
In the widefield mode in Fig. 1(C), the line-scanning illumination is replaced with widefield illumination. Taking advantage of the convenient programmability of the DLP on a per frame basis, widefield images can be acquired seamlessly without interrupting the real-time confocal stream.
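The timing figures quoted above can be cross-checked with a few lines of arithmetic. The sketch below assumes that the 1048 rows are read out uniformly over the 65 ms readout (consistent with the linear scan described in the text); the function names are illustrative and are not taken from the device software.

```python
# Values quoted in the text.
TOTAL_ROWS = 1048      # CMOS rows scanned during one readout
READOUT_MS = 65.0      # duration of a single readout
EXPOSURE_US = 1303.0   # fixed exposure time
APERTURE_UM = 21.9     # line aperture width at the sample plane


def row_time_us(total_rows=TOTAL_ROWS, readout_ms=READOUT_MS):
    """Readout time of one CMOS row in microseconds (uniform readout assumed)."""
    return readout_ms * 1000.0 / total_rows


def rows_under_exposure(exposure_us=EXPOSURE_US):
    """Rows exposed concurrently, i.e. the detection slit width in CMOS rows."""
    return exposure_us / row_time_us()


def sample_um_per_row(aperture_um=APERTURE_UM):
    """Sample-plane distance covered by one CMOS row, in micrometers."""
    return aperture_um / round(rows_under_exposure())


if __name__ == "__main__":
    print(f"row time: {row_time_us():.2f} us")
    print(f"detection slit: {rows_under_exposure():.2f} rows")
    print(f"sampling: {sample_um_per_row():.3f} um per row")
```

Under this assumption, one row takes about 62 µs to read out, so the 1303 µs exposure spans 21 rows, in agreement with the aperture width stated in the text.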

Automated confocal aperture alignment
Clinical ease-of-use and robustness are essential for new technologies to reach populations in low-resource settings [22]. The inexpensive components in the confocal microendoscope allow for optical sectioning at a low cost, but their fine alignment at the micron level can be vulnerable to motion during device transportation. As shown in Fig. 1(A), two right-angle mirrors can be used to align confocal apertures mechanically, but the manual adjustment requires training to properly adjust mechanical parts within the enclosure. In addition, subtle differences at the microscopic level can be challenging to discern even for experienced users. To overcome these barriers for technology dissemination, we implement a fast (< 2 mins) and automated alignment algorithm that can be performed to ensure high quality confocal imaging by operators with minimal training.
The compromised imaging quality due to aperture misalignment is illustrated in confocal images of lens paper in Fig. 2(A). First and foremost, the misaligned image captures more scattered light instead of in-focus fluorescence. As a result, it suffers from reduced image brightness and contrast, while showing significantly higher out-of-focus background. In addition, the brightness loss is accompanied by a periodic pattern due to the discrete nature of DLP aperture projection shown in Fig. 1(C). Specifically, CMOS rows that are far from the DLP illumination aperture centers reveal increased signal loss. This pattern can be analyzed by plotting the average intensity of each CMOS row along the scanning direction in Fig. 2(A) (rows 300-700 are shown). The magnitude of the periodic scanning artifacts is then extracted by measuring the peak values (asterisks in Fig. 2(A)) at the corresponding frequency in the Fourier spectrum.
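The row-profile analysis described above can be sketched as follows. This is a minimal illustration, not the authors' implementation: `period_rows` (the spacing between DLP illumination aperture centers, in CMOS rows) is a hypothetical parameter that the text does not specify, and the 300-700 row window is borrowed from the plotted range purely as a default.

```python
import numpy as np


def row_profile(image, first_row=300, last_row=700):
    """Average intensity of each CMOS row along the scanning direction."""
    return np.asarray(image, dtype=float)[first_row:last_row].mean(axis=1)


def scan_artifact_magnitude(image, period_rows, first_row=300, last_row=700):
    """Amplitude of the row-profile component at the line-scanning period.

    period_rows is an assumed input: the distance, in CMOS rows, between
    successive illumination aperture centers.
    """
    profile = row_profile(image, first_row, last_row)
    profile = profile - profile.mean()               # remove the DC term
    # Scale so that a pure sinusoid of amplitude A yields a peak of A.
    spectrum = np.abs(np.fft.rfft(profile)) * 2.0 / profile.size
    freqs = np.fft.rfftfreq(profile.size, d=1.0)     # cycles per row
    k = int(np.argmin(np.abs(freqs - 1.0 / period_rows)))
    # Search one bin either side of the target to tolerate spectral leakage.
    lo, hi = max(k - 1, 1), min(k + 2, spectrum.size)
    return float(spectrum[lo:hi].max())
```

A well-aligned image should give a value near zero, while a misaligned image with a periodic brightness loss gives a value that grows with the depth of the modulation.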
Since both aperture sequences are linearly scanned, the spatial aperture misalignment can be calculated as a proportional temporal offset. This allows us to align the apertures by conveniently programming a temporal delay without the need for mechanical adjustment. To facilitate automated alignment with improved accuracy, we create a stationary scene in the FOV by connecting the confocal microendoscope with a simple and small calibration target. The target in Fig. 2(B) consists of a short fiber bundle secured in an SMA connector, which is covered with solidified fluorescent acrylic paint at its distal end to provide fluorescence signal. The temporal offset is then characterized with a misalignment score, defined as the scanning artifact magnitude in the Fourier spectrum in Fig. 2(A) divided by the mean image brightness. During the calibration, a series of images is acquired with added temporal delays from −1000 µs to +1000 µs to accommodate the range of fine misalignment, which corresponds to a spatial range of 32 CMOS rows (in comparison, the detection aperture width is 21 CMOS rows). The temporal delay is varied with a 40 µs increment, smaller than the readout duration of a single CMOS row. Figure 2(C) shows representative images of the calibration target obtained with varied temporal offsets (−800 µs, −400 µs and 0 µs, respectively). As the temporal offset is reduced, the scanning artifacts are attenuated, and the image brightness is increased. We also note that while imaging artifacts are present with a temporal offset of −400 µs, it can be difficult to visually recognize them without examining the fine structures. By plotting the misalignment score in Fig. 2(D), the apertures can be aligned with a temporal delay that minimizes the misalignment artifacts.
Using the automated algorithm, the alignment can be performed by simply placing the calibration target and running the built-in program, which replaces the previously described mechanical alignment and can easily be performed by users with less training.
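The calibration sweep can be summarized in a short sketch. It assumes a hypothetical callable `acquire(delay_us)` that programs the given delay, images the calibration target, and returns the scanning artifact magnitude together with the mean image brightness; this callable and the function names below are illustrative and do not describe the built-in program.

```python
import numpy as np

ROW_TIME_US = 65_000.0 / 1048   # readout time per CMOS row, from the text


def misalignment_score(artifact_magnitude, mean_brightness):
    """Scanning artifact magnitude divided by the mean image brightness."""
    return artifact_magnitude / mean_brightness


def sweep_delays(start_us=-1000, stop_us=1000, step_us=40):
    """Temporal delays tested during calibration, endpoints included."""
    return np.arange(start_us, stop_us + step_us, step_us)


def delay_to_rows(delay_us):
    """Spatial offset, in CMOS rows, equivalent to a temporal delay."""
    return delay_us / ROW_TIME_US


def calibrate(acquire, delays_us=None):
    """Return the delay with the lowest misalignment score and all scores.

    acquire is a hypothetical callable:
    delay_us -> (artifact_magnitude, mean_brightness).
    """
    if delays_us is None:
        delays_us = sweep_delays()
    scores = np.array([misalignment_score(*acquire(d)) for d in delays_us])
    return int(delays_us[int(np.argmin(scores))]), scores
```

With the default settings the sweep covers 51 delays, and the full 2000 µs span corresponds to roughly 32 CMOS rows, matching the range quoted in the text.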

Ex vivo imaging protocol
The confocal microendoscope was first evaluated by imaging ex vivo cervical specimens from loop electrosurgical excision procedures (LEEPs). In the ex vivo protocol, patients aged 21 years or older who were undergoing a LEEP for cervical precancer at The University of Texas MD Anderson Cancer Center (MD Anderson) or Lyndon B. Johnson Hospital (LBJ) were offered enrollment. Following the LEEP, the cervical specimens were immediately stained with topically applied proflavine (0.01% in PBS) and imaged ex vivo using the confocal microendoscope. After imaging, the specimens were processed per standard of care. The study was approved by the Institutional Review Boards (IRBs) at MD Anderson, LBJ and Rice University.

In vivo imaging protocol
The usability and imaging performance of the confocal microendoscope were further evaluated in vivo. As part of a multi-center prospective study (Clinical Trial ID: NCT02420665), women aged 21 years or older who had an abnormal screening Pap and/or positive HPV test were enrolled at the UT Health Mobile Clinic in the Rio Grande Valley, Texas. Visual inspection with acetic acid (VIA) and standard colposcopy were first performed. Microendoscopic imaging was then performed with topically applied proflavine (0.01% in PBS) by a physician assistant (PT). Abnormal areas based on VIA and/or colposcopy were imaged and biopsied, and pathology specimens were diagnosed by the institutional pathologist per standard of care. The study was approved by the IRBs at the University of Texas Medical Branch in Galveston, Rice University, the University of Texas MD Anderson Cancer Center, and the University of Texas Health Science Center at Houston, School of Public Health.

Quantitative and visual assessment of nuclear features
The enhancement in the overall image contrast in confocal images was evaluated using a simple global contrast (GC) metric defined below:

$$\mathrm{GC} = \frac{\text{standard deviation}(\text{pixel intensities})}{\text{mean}(\text{pixel intensities})}$$
We note that the metric above, as a simple measure of the global contrast, has its limitations in assessing the local contrast of key clinical features such as glandular borders and individual nuclei. As a result, in ex vivo images acquired with a stationary fiber mount that allowed accurate co-registration, we also quantified line profiles across representative clinical features including cell nuclei, glandular patterns and microvascular networks. In in vivo images acquired during imaging sessions where the fiber probe was held by hand, the contrast was visually assessed in selected regions of interest (ROIs), and image features were compared to the gold standard of histopathology. To facilitate visual assessment in these ROIs, a low-pass Gaussian filter was applied to remove fiber bundle patterns.
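The global contrast metric defined above can be computed directly from the pixel values. The sketch below assumes the population standard deviation (NumPy's default) and adds an optional `mask` argument, a hypothetical convenience for restricting the calculation to the circular FOV of the fiber bundle; neither choice is specified in the text.

```python
import numpy as np


def global_contrast(image, mask=None):
    """Global contrast: standard deviation of pixel intensities over their mean.

    mask is an optional boolean array (an assumption, not from the text)
    selecting the pixels to include, e.g. the circular field of view.
    """
    pixels = np.asarray(image, dtype=float)
    if mask is not None:
        pixels = pixels[np.asarray(mask, dtype=bool)]
    return float(pixels.std() / pixels.mean())


if __name__ == "__main__":
    signal = np.tile([0.0, 2.0], 50)    # idealized in-focus structure
    background = 2.0                    # uniform out-of-focus background
    print(global_contrast(signal + background), global_contrast(signal))
```

Because a uniform background raises the mean without changing the standard deviation, rejecting it increases GC, which is the behavior the metric is meant to capture when comparing widefield and confocal frames.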
We also compared the ability of the non-confocal and confocal imaging modes to resolve individual nuclei when images were visually examined by experienced users. Since the background scattering depends on nuclear density, which can be highly heterogeneous, we examined three sites from both squamous and columnar epithelium showing different clinical features. In each site, three ROIs of 200 × 200 µm² showing representative imaging features were selected for review by a cohort of three experienced users. To facilitate visual inspection, images were preprocessed with a low-pass filter to remove the fiber bundle patterns; a band-stop filter was applied in the Fourier domain on the confocal images to remove the minimal residual line-scanning artifacts as previously described in Fig. 2(A). During the visual assessment, individual nuclei that were in focus and could be clearly outlined within each ROI were labeled by the cohort. We then compared the number of nuclei that were identified using a paired t-test to assess whether confocal imaging could allow users to better identify nuclei in these sites.

Results
Figure 3 shows images from ex vivo LEEP specimens and reveals various features of clinical importance. As expected, confocal images reveal higher global contrast than widefield images in all the sites. In site A, the squamous epithelium was characterized by evenly spaced nuclei embedded in a heterogeneous background, which could be from non-specific staining of keratin in the ectocervix. With confocal scanning, the background signal was suppressed, showing improved delineation of nuclei. The contrast improvement with confocal imaging was more prominent in the columnar epithelium in site B. In widefield mode, the imaging contrast was impaired by increased out-of-focus background. With confocal scanning, the nuclei could be clearly discerned, even on glands with elevated nuclear density. Images of site C revealed further nuclear crowding and architectural effacement across the entire FOV, accompanied by microvessels that could be associated with angiogenesis during disease progression. In the confocal image, individual nuclei in crowded regions were better highlighted, and negatively stained vessels were also outlined with higher contrast. All these qualitative findings were confirmed in the corresponding line profiles, which demonstrated better delineation of nuclei (asterisks), glands (black brackets) and vessels (arrows) by the confocal microendoscope, especially in sites B and C.
We further evaluated the in vivo diagnostic performance of the confocal microendoscope in Fig. 4. In each panel, the site was imaged in widefield (left) and confocal (right) modes. The fluorescence imaging features were compared to histopathology results as the gold standard, and selected ROIs showing representative nuclear features were also compared. Benign squamous epithelium in the ectocervix (sites A-C in green boxes, Fig. 4) is consistently characterized by small and evenly spaced nuclei. It was noted that images in Fig. 4(C) showed slight nuclear crowding compared with Fig. 4(A) and 4(B), consistent with the pathology diagnosis indicating mild atypia. As the disease progressed to neoplasia, nuclear density was significantly increased in Fig. 4(D). In addition to nuclear crowding, Fig. 4(E) shows the presence of disrupted glandular patterns (white arrows) at 9-3 o'clock, consistent with the histopathology diagnosis that reports CIN 2 with glandular involvement. Figure 4(F) also revealed partially effaced glandular architecture (white arrows) in most of the FOV, which was confirmed by the diagnosis of CIN 2 at the SCJ. In short, images in Fig. 4 revealed alterations in nuclear morphology and glandular architecture that distinguish neoplastic lesions from benign sites, and these imaging features were confirmed with the corresponding histopathology. Importantly, subtle changes during disease progression, such as atypia, glandular involvement and metaplasia, were also highlighted in the confocal images. During in vivo free-hand imaging, pixel-level co-registration in two imaging modes could not be guaranteed to allow for direct comparison of imaging contrast. Nonetheless, substantial background rejection is clearly observed in confocal images, especially in regions with high scattering such as 6-12 o'clock in site B, the center of sites D and F, and 7-1 o'clock in site E.
Fig. 4. In each panel, widefield (left) and confocal (right) images are shown with the corresponding histopathology diagnosis provided above each image pair. In vivo confocal imaging reveals morphological alterations associated with cancer progression, which are consistent with histopathology. Compared with widefield mode, the confocal mode better highlights these architectural changes, especially in regions with crowded nuclei and high out-of-focus background. White arrows: glandular involvement. GC: global contrast. Circular FOV: 790 µm.
Qualitative and quantitative comparisons in Figs. 3 and 4 suggest that the confocal microendoscope has the potential to help users better recognize and characterize nuclear features. To test this hypothesis, images with varied nuclear density and background levels in Fig. 5 were examined. In the squamous epithelium in site A, individual nuclei could be outlined in both non-confocal and confocal modes even though the background was higher in the widefield image. In comparison, sites B and C suffered from low imaging contrast without confocal scanning. In each site, three ROIs (white rectangles, 200 × 200 µm² each) showing representative features were reviewed by experienced microendoscope users. The total nuclear count in each ROI was then plotted in Fig. 5. As expected, there was no statistically significant difference in nuclear count between widefield and confocal images in site A using a paired t-test (p = 0.754). In sites B and C, however, the nuclear count identified in the confocal images was significantly higher than in widefield images (p = 0.015 and p = 0.002, respectively). Interestingly, while more variation in the nuclear count was observed in sites B and C due to increased tissue heterogeneity, dense regions consistently showed high nuclear count in both imaging modes. In addition, most of the nuclei identified in widefield images (88.7%, 85.4% and 91.3% in sites A, B and C, respectively) were also located in the confocal mode. These results show that the confocal microendoscope can improve the accuracy of visual characterization of nuclear architecture in highly scattering regions.
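For readers who wish to reproduce this type of comparison on their own images, a paired t-test on matched nuclear counts can be sketched with SciPy's `scipy.stats.ttest_rel`. The counts in the example are placeholders chosen for illustration only and are not data from this study; how counts from multiple reviewers are paired is also an assumption left to the user.

```python
import numpy as np
from scipy import stats


def compare_nuclear_counts(widefield_counts, confocal_counts):
    """Paired t-test on nuclear counts from matched ROIs.

    Each position in the two sequences must refer to the same ROI (and the
    same reviewer, if several reviewers are pooled). Returns (t, p), where a
    positive t means higher counts in confocal mode.
    """
    widefield = np.asarray(widefield_counts, dtype=float)
    confocal = np.asarray(confocal_counts, dtype=float)
    if widefield.shape != confocal.shape:
        raise ValueError("counts must be paired one-to-one")
    result = stats.ttest_rel(confocal, widefield)
    return float(result.statistic), float(result.pvalue)


if __name__ == "__main__":
    # Placeholder counts for illustration only; NOT data from the study.
    example_widefield = [12, 15, 9, 14, 11, 10]
    example_confocal = [18, 22, 15, 19, 17, 16]
    t, p = compare_nuclear_counts(example_widefield, example_confocal)
    print(f"t = {t:.2f}, p = {p:.4f}")
```

The paired design matters here because nuclear density varies widely between ROIs; testing the per-ROI differences removes that between-ROI variation from the comparison.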

Discussion
Here, we describe a low-cost (<$5,000), portable and user-friendly confocal microendoscope optimized for clinical use. We first evaluated its performance when imaging a variety of clinically important structures in ex vivo cervical lesions. Its ability to detect and characterize early cervical cancers in vivo was then validated in a mobile clinic in a medically underserved community.
When compared with widefield imaging, we demonstrate that the confocal microendoscope enhances the imaging contrast of benign and neoplastic lesions. The capability to provide real-time pathological information with optical sectioning, combined with its affordability and clinical ease-of-use, makes it well suited for cervical lesion characterization at the point of care in low-resource settings, especially in settings that lack established pathology services.
A major barrier to imaging disease-associated morphological alterations is scattering in turbid biological tissue. In reflectance microscopy, imaging contrast is generated by the refractive index gradient, which necessitates the use of optical sectioning techniques to reject scattered background [16,17,23]. In comparison, fluorescence imaging promises a higher light yield and improved imaging contrast. When using contrast agents that predominantly accumulate in specific anatomic sites or cellular components, background signal from defocused image planes can also be reduced [24], [25]. Consistent with previous findings, our results demonstrate that widefield imaging with proflavine staining is capable of resolving the stratified and evenly spaced nuclear architecture in normal cervical squamous epithelium [26,27]. In the presence of increased nuclear density, however, the contrast of widefield microendoscope images can be hampered by the lack of background rejection. In the cervix, nuclear crowding is often observed in benign columnar epithelium, metaplasia, as well as disease-associated lesions such as atypia and precancers. When surveying these regions using the microendoscope probe, confocal scanning can significantly improve the image contrast of clinically important features.
The contrast improvement in confocal microendoscope images allows experienced users to better recognize and characterize nuclear structures in highly scattering regions. Previously, an automated algorithm was developed to diagnose cervical precancer in widefield microendoscope images based on nuclear size, eccentricity and density, and diagnostic accuracy comparable to standard colposcopy was reported prospectively [11]. While it was impractical to assess the diagnostic algorithm in a small set of confocal microendoscope images, our initial visual assessment by experienced users showed consistent numbers of nuclei identified in widefield and confocal modes in squamous epithelium (site A in Fig. 5), suggesting that the existing algorithm can be readily adapted to quantify nuclear morphology in confocal microendoscope images. In sites with increased scattering (sites B and C in Fig. 5), the users could delineate nuclei in the confocal microendoscope images that would otherwise be challenging to resolve in widefield images. These findings support the potential of the confocal microendoscope to characterize nuclear morphology with improved accuracy in a quantitative and automated manner.
The strengths of this study include that a wide range of benign and pathologic conditions were imaged and compared to the gold standard of histopathology. Moreover, we were able to evaluate and quantify the benefits of background rejection with both ex vivo and in vivo clinical data. Finally, in the pilot trial, the confocal microendoscope was operated by a physician assistant in a mobile clinic, demonstrating the feasibility of the device to be widely adopted in similar settings. The major limitation of the current pilot trial is its limited sample size, which makes it impractical to evaluate the diagnostic performance via visual assessment or automated algorithms. In this study, we compared key clinical features with histopathology, and we evaluated the benefits of background rejection as the images were reviewed and interpreted by expert users. The preliminary results warrant larger future studies to establish its diagnostic performance and to develop automated algorithms for image interpretation. In addition, the reported digital aperture approach provides a versatile and convenient means for optical sectioning, which can be implemented to image other fluorophores by optimizing design parameters such as exposure time and aperture width.
In conclusion, we present a portable, low-cost and robust confocal microendoscope with an automated alignment feature, and we demonstrate its initial clinical evaluation for early detection of cervical precancer. The digital confocal apertures in the confocal microendoscope are highly programmable and allow us to evaluate the benefit of confocal imaging with clinical data. Our preliminary results show that it has great potential to improve real-time characterization of nuclear morphology in cervical lesions, and future studies are warranted to further evaluate its performance in a larger population, especially with automated diagnostic algorithms.