Single pixel camera ophthalmoscope

Ophthalmoscopes to image the retina are widely used diagnostic tools in ophthalmology and are vital for the early detection of many eye diseases. Although there are various effective optical implementations of ophthalmoscopes, new, robust systems may have a future practical importance in cases where ocular media present significant opacities. Here, we present, as a proof of concept, a novel approach for imaging the retina in real time using a single pixel detector combined with spatially coded illumination. Examples of retinal images in both artificial and real human eyes are presented for the first time to our knowledge. © 2016 Optical Society of America

Modern flood-illuminated fundus cameras use a CCD or CMOS camera with an array of thousands of pixels to obtain an image, or video, of a part of the retina [1]. Good image quality is achieved by illuminating the full area of interest homogeneously. When aberrations are corrected with adaptive optics, even detailed structures of the photoreceptor mosaic can be resolved [2,3]. A scanning laser ophthalmoscope (SLO), on the other hand, scans a light spot rapidly across the retina [4]. The reflected intensity of each traversed point is recorded using a point detector such as a photomultiplier tube (PMT) [5] or an avalanche photodiode (APD) [6]. Subsequently, a high-resolution image of the retina is composed from the acquired data. Most recently, digital light ophthalmoscopes were developed that use a digital micromirror device (DMD) to scan lines across the retina, which are imaged either on a line camera or on a two-dimensional (2D) detector array to obtain an image [7,8]. All of the ophthalmoscopes described above are state-of-the-art devices and are widely used as medical instruments for monitoring the condition and alteration of the eye fundus. Here, we introduce another imaging modality, the single pixel camera technique, to the field of ophthalmic imaging.
The proposed system images the retina by combining spatially coded illumination with a single pixel detector in a double-pass configuration. Single pixel imaging has gained interest within the last decade and has been successfully applied in biological imaging and microscopy [9,10], infrared imaging [11-13], ultrasonic imaging [14], terahertz imaging [15], and three-dimensional (3D) imaging [16,17], and it has also been used in ghost imaging [18-20]. This type of system avoids the need for a scanning unit, which simplifies the optical system. The additional downscaling to a single pixel detector, instead of a pixel array, might allow a lower illumination power, which benefits the patient's comfort.
The system (Fig. 1) is built around a DMD (Vialux V7001, controller board V4395, chipset DLP 4100, pixel size: 13.68 μm, resolution: 1024 × 768 px). Each mirror pixel on the DMD can either guide light into the optical system or deflect it. The DMD is illuminated homogeneously with a broadband xenon lamp (Hamamatsu L7810-02) with the UV part of the spectrum blocked. This light source guarantees a stable output over time, which is essential for discriminating the minute intensity differences that arise from the spatial coding with the patterns described below.
The illumination is spatially coded by multiple 2D binary (black and white) patterns based on the Walsh-Hadamard transformation. These patterns are considered to have an optimum weighted design for spatial coding. Black and white pixels are equally distributed within each pattern, and these patterns are preferred over others, e.g., pseudo-random binary patterns [18,21]. The so-called Hadamard patterns or masks are generated with software developed in-house in C++ and sent via a USB 3.0 connection into the DMD's onboard memory. It is essential to pre-load all patterns before displaying them, as this is the only way to reach the maximum DMD frame rate of 22.727 kHz. Subsequently, the pattern sequence is initialized within an area of 512 × 512 pixels in the center of the DMD. The masks are scaled to match this area, meaning that, e.g., for N = 32, each pixel of the pattern spans 16 DMD pixels along each side.
Next, the patterns are imaged through the lenses L1, L2, L3, L4, and the eye's optics onto the retina, covering an area of around 15 deg of the visual field. The light intensity reflected from the retina is measured through the crystalline lens, L5, L6, and L7 with a silicon photomultiplier (SiPM, Excelitas Lynx A-33-050-T1-A), the single pixel detector. The detector is synchronized with the DMD, which triggers the intensity measurement of each projected pattern individually. The diaphragm D1 is placed centrally on the back plane of lens L2 and imaged via the telescope formed by lenses L3 and L4 onto the pupil plane; it therefore defines the beam's entry point. This configuration lowers the impact of aberrations on the image of the pattern and minimizes the area of possible backscattering and reflections. The data from the photomultiplier are transferred via an analog-to-digital converter (National Instruments PCIe-6361, sampling rate 2 MS/s) to a desktop PC (i5-4590, quad-core, 3.3 GHz, 8 GB RAM).
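The pattern generation and scaling described above can be illustrated with a minimal Python sketch. It is not the in-house C++ software: it assumes the Sylvester construction in natural (not sequency) order, and the function names `hadamard`, `patterns_2d`, and `upscale` are hypothetical.

```python
def hadamard(order):
    """Sylvester construction of a Hadamard matrix with entries +1/-1.

    Assumption: natural (Sylvester) row ordering; the source does not state
    which ordering of the Walsh-Hadamard basis was used.
    """
    if order < 1 or order & (order - 1):
        raise ValueError("order must be a power of two")
    h = [[1]]
    while len(h) < order:
        # Doubling step: [[H, H], [H, -H]]
        h = [row + row for row in h] + [row + [-v for v in row] for row in h]
    return h


def patterns_2d(n):
    """Return the n*n basis patterns, each reshaped to an n x n grid of +1/-1."""
    rows = hadamard(n * n)
    return [[row[r * n:(r + 1) * n] for r in range(n)] for row in rows]


def upscale(pattern, factor):
    """Replicate every pattern pixel into a factor x factor block of mirrors."""
    out = []
    for row in pattern:
        wide = [v for v in row for _ in range(factor)]
        out.extend(list(wide) for _ in range(factor))
    return out


if __name__ == "__main__":
    pats = patterns_2d(4)            # small N keeps the demo fast
    mask = upscale(pats[1], 128)     # 4 * 128 = 512 mirrors per side
    print(len(pats), len(mask), len(mask[0]))
```

For N = 32 the same scaling uses a factor of 16, since 32 × 16 = 512. The ±1 entries still have to be mapped to mirror states before display.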
The high sampling rate of the analog-to-digital converter (ADC) allows for 84 measurements per pattern. Not all data points are used for further processing, since data from the start of each pattern measurement are affected by minute mirror wiggling while the mirrors settle. The data received for each pattern are averaged and stored as one value, intensity_i, per pattern. A full reconstruction of an image with resolution N × N requires N² patterns to be displayed, meaning that for N = 32, n = N² = 1024. Since the mathematical description of the binary Hadamard patterns uses the values {−1, 1}, whereas the DMD can only display {0, 1}, twice the number of patterns must be displayed, resulting in n = 2N² = 2048 for N = 32. The patterns are displayed in the following order: a positive pattern (values 0 and 1) followed by its inverted complement (representing the values −1 and 0). Afterward, the data from the inverted masks are subtracted from the data of the positive masks, so that the final data correspond to the correct mathematical model of the Hadamard patterns with values {−1, 1}. With this display order, which is also used in other studies [9,10,22], noise is almost completely eliminated and the data support an optimal reconstruction. One might argue that displaying only the positive patterns and subsequently subtracting the average of all intensity measurements would approximate the Hadamard patterns adequately and therefore be sufficient to reconstruct the object, but that only works satisfactorily under optimum illumination conditions. In a low-light environment, noise induced by the photomultiplier, the light source, or the environment (status LED, PC display) may have a severe impact on the image reconstruction [23].
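The differential measurement described above can be sketched in Python as follows. This is an illustration under stated assumptions, not the authors' acquisition code: the detector is modeled as a plain sum of the reflected light plus a constant background `offset`, the number of discarded settling samples (`skip`) is a hypothetical parameter that the text does not specify, and all function names are invented for the example.

```python
def average_after_settling(samples, skip):
    """Average the ADC samples of one pattern, ignoring the first `skip`
    samples taken while the mirrors settle (skip is a hypothetical value)."""
    usable = samples[skip:]
    return sum(usable) / len(usable)


def split_masks(pattern):
    """Turn a +1/-1 pattern into the two 0/1 masks the DMD can display."""
    positive = [[(v + 1) // 2 for v in row] for row in pattern]   # +1 -> 1, -1 -> 0
    inverted = [[(1 - v) // 2 for v in row] for row in pattern]   # +1 -> 0, -1 -> 1
    return positive, inverted


def detector_signal(scene, mask, offset=0.0):
    """Idealized single pixel reading: background offset plus total masked light."""
    return offset + sum(
        s * m for srow, mrow in zip(scene, mask) for s, m in zip(srow, mrow)
    )


def differential_intensity(scene, pattern, offset=0.0):
    """Positive reading minus inverted reading; a constant offset cancels."""
    positive, inverted = split_masks(pattern)
    return detector_signal(scene, positive, offset) - detector_signal(scene, inverted, offset)


if __name__ == "__main__":
    scene = [[1, 2], [3, 5]]
    pattern = [[1, -1], [-1, 1]]
    print(differential_intensity(scene, pattern, offset=0.0))
    print(differential_intensity(scene, pattern, offset=7.5))  # same value
```

In this idealized model only a background that stays constant between the two exposures cancels exactly; fluctuating noise is reduced rather than removed.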
The object is reconstructed using the following equation:
$$O(m, n) = \sum_{i=1}^{N^2} \mathrm{intensity}_i \cdot \mathrm{pattern}_i(m, n), \qquad (1)$$
where O(m, n) is the reconstructed object and pattern_i(m, n) denotes the set of matrices of the Hadamard basis with dimensions N × N, consisting of "1s" and "−1s." The value intensity_i represents the averaged intensity per pattern after the subtraction of the negative data from the positive data, and m, n are discrete spatial coordinates. Note that the intensity values are the coefficients of the transformation of the image into the Walsh-Hadamard basis, and Eq. (1) simply transforms back to the spatial domain. Figure 2 shows the intensity measurements obtained using a model eye with a lens (f_eye = 20 mm, ∅12.7 mm) and an imprinted letter acting as a retinal surrogate, after imaging with a resolution of N = 32 and a total of 2048 patterns. Figure 2(a) shows the original object with a black square of 5 mm × 5 mm and printed letters with four different grayscale values on a white background. Figure 2(b) shows the image of the reconstructed object obtained with the maximum DMD frame rate and up-sampled to 128 × 128 pixels without interpolation for better viewing. A detailed image reconstruction is already possible due to the object's simple structure. Figure 2(c) displays the raw intensities, measured 84 times for each of the 2048 patterns. At this point, all the obtained measurements are positive. The final data, after subtraction and averaging, are shown in Fig. 2(d); consequently, the computed data contain negative values and vary around the zero baseline. Finally, this set of data is used to reconstruct an image of the object, as depicted in Fig. 2(b), by applying Eq. (1). Figure 3 shows additional results using a model eye imaged at various resolutions, demonstrating the feasibility of the presented optical system.
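A minimal Python sketch of the back-transformation in Eq. (1) is given below. It simulates ideal, noise-free differential intensities for a small test object and sums the weighted patterns. The 1/N² normalization and the Sylvester ordering of the basis are assumptions made for this example, and the function names are hypothetical.

```python
def hadamard_patterns(n):
    """n*n Hadamard basis patterns (Sylvester order, an assumption), each n x n."""
    h = [[1]]
    while len(h) < n * n:
        h = [r + r for r in h] + [r + [-v for v in r] for r in h]
    return [[row[k * n:(k + 1) * n] for k in range(n)] for row in h]


def measure(scene, patterns):
    """Ideal differential intensities: projection of the scene onto each pattern."""
    return [
        sum(s * p for srow, prow in zip(scene, pat) for s, p in zip(srow, prow))
        for pat in patterns
    ]


def reconstruct(intensities, patterns):
    """Sum of intensity_i * pattern_i(m, n), as in Eq. (1).

    The division by the number of patterns is an assumed normalization so
    that the output has the same scale as the input scene.
    """
    n = len(patterns[0])
    image = [[0.0] * n for _ in range(n)]
    for coeff, pat in zip(intensities, patterns):
        for m in range(n):
            for k in range(n):
                image[m][k] += coeff * pat[m][k]
    scale = len(patterns)
    return [[v / scale for v in row] for row in image]


if __name__ == "__main__":
    scene = [[0, 1, 1, 0], [1, 3, 3, 1], [1, 3, 3, 1], [0, 1, 1, 0]]
    pats = hadamard_patterns(4)
    image = reconstruct(measure(scene, pats), pats)
    for row in image:
        print(row)
```

Because the basis patterns are mutually orthogonal, the noise-free round trip recovers the test object; with real measurements, detector noise and eye motion perturb the coefficients.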
It is worth mentioning that for N = 256, only the positive patterns are shown (n = N² = 65536; the full set would require n = 2N² = 131072) due to the restrictions of the DMD's memory, which cannot store more than 87380 binary patterns. In this case, the display time only doubles rather than quadrupling (compared to N = 128, see Table 1), but the reconstructed images are of lower quality, since the noise effects are not compensated effectively.
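The pattern-count arithmetic behind this memory restriction can be checked with a short Python sketch. The limit of 87380 patterns is taken from the text; the function names and the fallback rule (drop the inverted patterns when the full set does not fit) are written here only to mirror the described behavior.

```python
DMD_MEMORY_PATTERNS = 87380  # maximum number of stored binary patterns (from the text)


def pattern_counts(n):
    """Return (positive-only count, full positive+inverted count) for an n x n image."""
    positive_only = n * n
    return positive_only, 2 * positive_only


def displayable_count(n, limit=DMD_MEMORY_PATTERNS):
    """Number of patterns actually shown: the full set if it fits in memory,
    otherwise only the positive patterns."""
    positive_only, full = pattern_counts(n)
    if full <= limit:
        return full
    if positive_only <= limit:
        return positive_only
    raise ValueError("not even the positive patterns fit into the DMD memory")


if __name__ == "__main__":
    for n in (32, 64, 128, 256):
        print(n, pattern_counts(n), displayable_count(n))
```

Going from N = 128 (32768 patterns) to N = 256 with positive patterns only (65536 patterns) doubles the number of displayed patterns, which is why the display time doubles instead of quadrupling.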
The images shown in Fig. 3 are reconstructed without any image post-processing except for N = 256, where contrast stretching is applied because the dynamic range was very narrow. The quality of the images could be further improved by lowering the frame rate of the DMD. That might be suitable for measurements with a static object, but during in vivo measurements, subtle eye movements would affect the image quality drastically, since the area of illumination needs to be constant. Furthermore, increasing the imaging time would be contrary to the ophthalmoscope's desired real-time operation capability. One pixel in an N = 32 reconstruction corresponds to around 0.18 mm. Doubling N halves the pixel size, so that for N = 256, one pixel corresponds to 0.022 mm. The scale bar is indicated within Fig. 3.
We were able to record images and video streams with the proposed optical instrument. Table 1 shows an overview of the achieved frames per second (FPS) during continuous acquisition and reconstruction of images when operating in real-time mode. The mathematical calculations to reconstruct an image are simple, but the frame rate drops quickly with increasing resolution, not only because the number of patterns to be displayed grows quadratically with N, but also because of the vast amount of data that needs to be processed. Additionally, we compared our video rate to those obtained by other groups using the same imaging technique outside the field of retinal imaging.
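As a rough illustration of how the theoretical frame rate follows from the DMD pattern rate, the following Python sketch divides the maximum rate quoted earlier (22.727 kHz) by the number of patterns per frame. It ignores data transfer and processing, so it only gives an upper bound; the function names are hypothetical, and the values it prints are computed here, not copied from Table 1.

```python
DMD_RATE_HZ = 22727.0  # maximum DMD frame rate quoted in the text (22.727 kHz)


def frame_time_s(n, differential=True, rate_hz=DMD_RATE_HZ):
    """Display time for one image frame of resolution n x n.

    differential=True means positive and inverted patterns (2 * n^2);
    differential=False means positive patterns only (n^2).
    """
    patterns = (2 if differential else 1) * n * n
    return patterns / rate_hz


def theoretical_fps(n, differential=True, rate_hz=DMD_RATE_HZ):
    """Upper bound on frames per second, limited only by the pattern rate."""
    return 1.0 / frame_time_s(n, differential, rate_hz)


if __name__ == "__main__":
    for n in (32, 64, 128):
        print(n, round(frame_time_s(n), 3), round(theoretical_fps(n), 2))
    # N = 256 with positive patterns only, as dictated by the DMD memory
    print(256, round(frame_time_s(256, differential=False), 3))
```

Each doubling of N multiplies the number of patterns by four, so the bound on the frame rate falls by the same factor as long as the full pattern set is displayed.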
The following results were obtained in living human eyes using several DMD frame rates and the full set of n = 2N² patterns. Figure 4(a) presents an initial unprocessed result taken in a young volunteer with an illumination time of 0.54 s and a resolution of N = 32. The images on the right in Fig. 4 show the average of 10 frames using the same settings as the left ones. Averaging was done without correcting for possible eye movements between frames. Image postprocessing is not applied, but the image is up-sampled to a size of 128 × 128 pixels. The main blood vessels, which join at the optic nerve head, can be distinguished clearly. Further details are not visible due to the lack of resolution and the presence of noise. Higher-resolution images obtained from the averaged video frames (taken from Visualization 1 for N = 64 and Visualization 2 for N = 128) are shown in Fig. 5.
Compared to the results shown in Fig. 4, more details of the vessel map around the optic nerve head are visible. The improvement is not as large as anticipated because displaying the full set of patterns takes longer, so eye motion influences the outcome severely. If the subject moves, even subtly, different areas of interest are illuminated, and this results in useless reflection intensity measurements. Furthermore, with the increase in resolution comes an increased level of noise. Images taken with a resolution of N = 256 are not shown here for the following reasons. First, the overall quality is low, as the imaging time for a single frame is around three seconds (at maximum speed), and eye movements therefore have a drastic effect on the reconstructed result. Second, with the current hardware, we are not able to project the inverted patterns, which are necessary to compensate for noise effects. As Figs. 4 and 5 show, an increase in resolution does not necessarily improve the image quality, but rather introduces high-frequency noise. This occurs mainly when finely structured patterns are not projected perfectly onto the retina (due to aberrations) and therefore introduce a light haze in the reconstructed image, as shown in Fig. 6. Nevertheless, the extended imaging time combined with a non-constant area of illumination, due to the motion of the eye, is the most challenging issue. To reduce the influence of motion artifacts, it is necessary to keep the imaging time as short as possible. To that end, masks can be removed from the display sequence, since some masks do not provide significant information. The technique of displaying fewer patterns is known as adaptive imaging or evolutionary compressive sensing [10,17]. To gain knowledge about the patterns that can be eliminated, a full set of 2N² masks needs to be displayed prior to classification. Subsequently, the obtained intensity data are sorted in order of decreasing responsivity and then loaded into the memory of the DMD to provide instant availability.
Fig. 3. Results using a model eye (raw images without image postprocessing). Top row: single frames; bottom row: the average of 10 frames. The DMD is set to its maximum display rate. Note: For N = 256, only the positive patterns were displayed, hence the increased noise and artifacts. Imaging time per frame is indicated. Theoretical FPS is solely based on the maximum frame rate of the DMD without further data processing.
Afterward, a threshold is chosen to separate the patterns into useful and dispensable ones. From there, we are left with two options to take advantage of the smaller number of patterns:
(i) Increasing the illumination time per pattern, T, which gives the detector a longer integration time but maintains the same overall illumination time, t, which is not desired.
(ii) Keeping the illumination time per pattern, T, constant, which results in a shorter imaging time per frame.
Figure 6 shows results of the latter adaptive imaging configuration for 100, 75, 50, 30, 25, and 10% of the 2N² patterns in the model eye setup at a resolution of 128 × 128 pixels. As can be seen, displaying fewer than 30% of the patterns is sufficient for a good reconstruction of the object, although this depends strongly on the object's structure and its level of detail. With fewer patterns displayed, the haze (high-frequency noise) in the reconstructed images disappears and the real-time video frame rate increases significantly. The imaging times per frame are indicated within the images. Another technique that is commonly used with a single pixel detector and a reduced number of patterns is compressive sensing (CS) [12,16,21,24,25]. As it is based on a time-consuming statistical algorithm, we omit CS for the moment and focus on a real-time optical instrument.
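The adaptive selection described above (rank the patterns by their measured response, keep a chosen fraction, and display only those) can be sketched in Python as follows. This is an illustration, not the authors' implementation: ranking by the absolute value of the differential intensity is an assumption about what "highest responsivity" means, and the function names and the fraction-based threshold are hypothetical.

```python
def select_patterns(intensities, fraction):
    """Return indices of the patterns to keep, strongest response first.

    Assumption: "responsivity" is taken as the magnitude of the differential
    intensity measured once with the full pattern set.
    """
    if not 0.0 < fraction <= 1.0:
        raise ValueError("fraction must be in (0, 1]")
    keep = max(1, int(fraction * len(intensities) + 0.5))
    ranked = sorted(
        range(len(intensities)),
        key=lambda i: abs(intensities[i]),
        reverse=True,
    )
    return ranked[:keep]


def reduced_frame_time(kept_count, time_per_pattern_s):
    """Option (ii): constant time per pattern T, so frame time scales with the
    number of patterns that are still displayed."""
    return kept_count * time_per_pattern_s


if __name__ == "__main__":
    coefficients = [0.1, -5.0, 2.0, -0.3]
    kept = select_patterns(coefficients, 0.5)
    print(kept, reduced_frame_time(len(kept), 0.5))
```

If each retained coefficient still requires a positive and an inverted exposure, the number of displayed masks is twice the number of retained indices.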
In conclusion, we described a novel instrument, based on single-pixel detection, for imaging the fundus of an eye in vivo over an area of almost 15 visual degrees. Images and videos taken in a static artificial eye demonstrate its feasibility and capability, while imaging in vivo remains challenging, mainly due to unintentional eye movements during extended illumination times. In terms of image quality and resolution, the obtained results cannot compete with comparable ophthalmic imaging instruments, the main limitation being the maximum frame rate of the DMD. Future technology improvements and customized hardware might close that gap. In addition, one can expect that this type of single-pixel ophthalmoscope could operate under a larger range of eye conditions, such as increased ocular aberrations or scatter [26], as in the case of cataract patients [27]. Another potential advantage would be the ability to obtain retinal images in spectral bands beyond the limits of current camera technology.
Funding. European Research Council (ERC) (ERC-2013-
The first row depicts a single frame, while the bottom row shows an average of 10 frames. The DMD was set to its maximum frame rate.