Simulating the perceptual effects of electrode–retina distance in prosthetic vision

David Avraham; Yitzhak Yitzhaky

doi:10.1088/1741-2552/ac6f82

1. Introduction

Retinal prosthetic devices that electrically stimulate the retina are a promising approach to treating blindness caused by retinitis pigmentosa and geographic atrophy [1–5]. The psychophysical aspects of prosthetic vision have been studied intensively in the last few decades but are not yet fully understood. This is likely because of the various spatial and temporal aspects that affect prosthetic vision, the small number of users (500 globally) [6], and the variations among the users regarding their spatial and temporal perception [2, 7–25].

Various retinal prostheses have been developed and implanted in patients over the last few decades [6]. One of these prostheses, the epiretinal prosthesis Argus II (Second Sight Medical Products, CA, USA), received both FDA and CE mark approval [6]. It consists of intraocular and extraocular units. The external unit includes a head-mounted miniature camera, a signal and power transmitting coil, a video processor, and a battery. The intraocular unit includes a receiving coil and an electrode array that consists of 10 × 6 platinum electrodes, each 200 μm in diameter with 575 μm center-to-center spacing. The external unit captures a real-time video of the outside world and transmits it wirelessly to the internal unit inside the eye; this information reaches the electrodes in electrical pulses. The prosthesis covers 19° × 11° in the visual field [26]. There are currently more than 350 Argus II users, exceeding the number of all the other retinal prostheses users combined [6]. The Intelligent Retinal Implant System (IRIS) II device (Pixium Vision SA, Paris, France) includes 150 platinum electrodes, each with a diameter of 250 μm [27]. Another epiretinal prosthesis, the Learning Retinal Implant System (Intelligent Medical Implants (IMI) GmbH, Bonn, Germany), includes 49 platinum electrodes [28], each with a diameter of 250 μm [6].

The electrode–retina distance (ERD) is the distance between the electrodes in the array and the surface of the retina. The ERD varies significantly within the same electrode array and thus should be measured for each electrode in the array. The ERD varies for several reasons. First, the electrode array in most retinal prostheses is implanted in the macula [6, 29, 30], and the concave shape of the macula causes large variations in the ERD [31, 32]. Second, the electrode array in the Argus I and II and IRIS II is attached to the retina by only one tack [27, 33]; thus, away from the attachment point the array is free to lift off the retina, resulting in a varying distance between the electrode array and the underlying inner retinal surface. The ERD has been measured in Argus II users using optical coherence tomography (OCT), an imaging technology that can produce high-resolution, cross-sectional retina images [34]. It was found to range from 17 to 706 µm [4, 31, 32, 35–38]. The ERD could be much larger (up to 1420 µm) in suprachoroidal prostheses [39]. When the ERD is large, the vitreous humor has a more resistive impact on the current amplitude [40]. Thus, the current amplitude that reaches the retina is lower. As this distance increases, the current amplitude required to elicit a neural response (threshold current amplitude) increases with epiretinal prostheses [4, 7, 10, 41], subretinal prostheses [29, 42–44], and suprachoroidal prostheses [39]. Electrodes located above a certain distance from the retina may not be able to evoke phosphenes [4, 10, 41]. For these reasons, a large ERD also decreases the ability to perform visual tasks, as has been found in psychophysical experiments [38].

The stimulation process in Argus II may involve the grouping of electrodes [45]. An electrode group is a group of adjacent electrodes that are stimulated simultaneously. Phosphenes evoked by groups of 2 × 2 electrodes (quads) are bigger and brighter than phosphenes evoked by single electrodes [46]. In addition, evoking phosphenes by quads requires a lower threshold current amplitude than evoking phosphenes by individual electrodes because of current summation [45, 46]. Thus, it is recommended to use electrode groups when individual electrodes do not evoke phosphenes [47]. Electrode grouping may compensate for retinal areas that are not being stimulated strongly enough by one electrode to elicit a phosphene, which may be owing to a large ERD, by reaching the threshold current amplitude required for perception. Therefore, quads may be helpful when the ERD reaches a predetermined threshold value. However, quads decrease the resolution by up to a factor of 4 because each electrode represents a specific field of view (FOV), and in quads, they are stimulated by the same signal.

The effects of ERD are pronounced in several spatial aspects, including size [48], brightness [48], and shape [8]. Electrodes closer to the retina evoke more elongated and bigger phosphenes than electrodes more distant from the retina [8]. Several studies developed electrophysiological simulation models of the ERD's impact on the stimulation threshold current and the impedance in epiretinal prostheses [49, 50] and subretinal prostheses [51, 52]. Many others discussed the importance of its effects on perception in prosthetic vision [2, 4, 8, 39, 41]. However, to our knowledge, none of these previous studies simulated the ERD's perceptual effects, nor how different approaches may reduce its adverse impact on prosthetic vision. Simulating and understanding the perceptual effects of ERD may enable the identification and development of technological approaches that may reduce its adverse effects on the quality of retinal prosthetic vision.

In this work, these effects are demonstrated, including random spatial variations in prosthetic vision that have been previously implemented—size, brightness, shape, dropout, and spatial shifts of phosphenes relative to the electrodes' position in the array [53–56]. In addition, three approaches that aim to reduce the adverse effects of ERD on perception are simulated and discussed. One approach is electrode grouping in cases where single electrodes cannot elicit phosphenes. Another approach involves a manipulation of the prosthetic input image using a correction matrix that increases the values of the pixels according to the ERD of each electrode to reduce the perceptual effects caused by ERD. These two approaches and their combination are verified in human-subject experiments and image similarity metrics. The third approach to recovering vision loss caused by large ERDs is object scanning by the natural head movements of the user. This is simulated with and without phosphene persistence. Finally, prosthetic vision with object scanning and image enhancement is simulated and discussed.

2. Methods

2.1. Stimulation model

In Argus II, the electrode array is attached to the retina by a single titanium tack located in the center of one narrow side (see figure 1(a)) [33, 57]. While the tacked side is attached to the retina, the other side is unattached and may tilt away from the retina (see figure 1(b)).

Table 1 describes the terms regarding ERD. Given that the electrode array is implanted over the fovea [57], the largest ERD is likely to be around the center of the electrode array because of the macular concavity, as shown in different OCT studies associated with Argus II users [31, 35–38]. A heat map of the ERDs, which was used in all simulations (except those shown in figures 5(b) and (c)), is presented in figure 1(c).

Table 1. Key terms and definitions regarding the ERD.

Term	Definition
Electrode–retina distance (ERD)	The distance between the electrode array and the retina in mm. As the ERD increases, the elicited phosphenes are darker, bigger, and less elongated (rounder).
Maximum ERD (ERD_max)	The largest ERD in mm. In the simulations, the ERD reaches its maximum value in electrodes 6C and 6D (see figure 1(c)).
Critical ERD (ERD_crit)	An ERD above which there is no perception of phosphenes within the safe charge density limit because electrodes cannot sufficiently stimulate the retinal ganglion cells.
Tilted electrode array	An electrode array tacked to the retina only on one side and thus may be tilted away from the retina (see figure 1(b)).
Electrode group (quad)	Four adjacent electrodes (2 × 2) stimulated simultaneously with the same signal. Quads become necessary for ERD > ERD_crit.

The dependence of the threshold current amplitude on the ERD ranges from linear to an exponent of 2. One study found that the threshold current amplitude increased approximately linearly with the ERD up to 0.5 mm [4]; other studies found that the threshold current amplitude increased with the square of the ERD in rabbit retinas [43] and macaque retinas [58]. This is reasonable, as it matches the decay of the electric field with the square of the distance in an isotropic medium with distant boundaries. Other studies found that the threshold current increased significantly for larger ERDs (ERD > 0.5 mm), where the relationship approximately follows an exponent of 2 [7, 39, 41]. In the simulation, a model that fits all these studies was adopted. Therefore, the threshold current dependence on the ERD is defined by:

$\begin{align}{I_{{\text{th}}}}\left( {{\text{ERD}}} \right) = \left\{ \begin{array}{*{20}{l}} {a + b{\text{ERD}},}&{{\text{ERD}} \leqslant 0.5\,{\text{mm}},} \\ {a + b\exp \left( {2{\text{ERD}}} \right)/c,}&{{\text{ERD}} > 0.5\,{\text{mm}},} \end{array}\right.\end{align} \tag{ 1 }$

where a is a parameter that represents the minimal current amplitude required to evoke a phosphene at zero ERD, b describes the linear dependency between the threshold current amplitude and the ERD, and the factor c = 5.437 (that is, 2e, with the ERD expressed in mm) was used to make the graph continuous at 0.5 mm. The values a = 130 µA and b = 0.5 A m⁻¹ were chosen to approximately fit the graph of the experimental data found with Argus II users, for a 0.45 ms pulse at a 20 Hz stimulation rate [4]. The maximum stimulation current of 200 µm platinum electrodes (Argus II [4]) and 250 µm platinum electrodes (IRIS II [27], Learning Retinal Implant System [28]) was 311 µA and 486 µA, respectively, for a 0.45 ms pulse, within the safe charge density limit of 0.35 ${\text{mC}}\,{\text{c}}{{\text{m}}^{ - 2}}$ [59]. The graph of the threshold current as a function of the ERD is presented in figure 2(a).
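Equation (1) can be evaluated and inverted in a few lines. The following is a minimal sketch, not the authors' MATLAB code: it assumes the ERD is expressed in mm, so that b = 0.5 A m⁻¹ becomes 500 µA mm⁻¹, and it takes the maximum allowed currents (311 µA and 486 µA) directly from the text. The function names are hypothetical.

```python
import math

A_UA = 130.0           # a: threshold current at zero ERD, in µA (from the text)
B_UA_PER_MM = 500.0    # b = 0.5 A/m, rewritten as µA per mm (assumed unit choice)
C = 2.0 * math.e       # continuity factor; 2e is about 5.437, as quoted in the text
ERD_BREAK_MM = 0.5     # boundary between the linear and exponential branches


def threshold_current_ua(erd_mm):
    """Equation (1): threshold current in µA for an ERD given in mm."""
    if erd_mm <= ERD_BREAK_MM:
        return A_UA + B_UA_PER_MM * erd_mm
    return A_UA + B_UA_PER_MM * math.exp(2.0 * erd_mm) / C


def critical_erd_mm(i_max_ua):
    """Invert equation (1): the ERD at which the threshold reaches i_max_ua."""
    if i_max_ua <= A_UA:
        return 0.0
    if i_max_ua <= threshold_current_ua(ERD_BREAK_MM):
        return (i_max_ua - A_UA) / B_UA_PER_MM          # linear branch
    return 0.5 * math.log(C * (i_max_ua - A_UA) / B_UA_PER_MM)  # exponential branch


if __name__ == "__main__":
    for label, i_max in (("200 µm", 311.0), ("250 µm", 486.0)):
        print(label, "electrodes: critical ERD ~", round(critical_erd_mm(i_max), 3), "mm")
```

Under these assumptions the inversion gives a critical ERD of roughly 0.36 mm for the 200 µm electrodes and a value close to the 0.675 mm quoted for the 250 µm electrodes.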

**Figure 2.** The simulated effects of the ERDs on stimulation and perception of phosphenes. (a) Threshold current as a function of the ERD (blue solid line). The black and red dashed lines represent the safe charge density limit for a 0.45 ms pulse duration for 200 µm and 250 µm electrodes, respectively. (b) Phosphene size normalized factor as a function of the ERD for 200 µm (black solid line) and 250 µm (red dash-dot line) electrodes. The phosphene size varies between no change at zero ERD and 100% bigger at 0.7 mm ERD. (c) Phosphene brightness normalized factor as a function of the ERD for 200 µm (black solid line) and 250 µm (red dash-dot line) electrodes. The phosphene brightness varies between no change at zero ERD and completely dark (non-elicited phosphene) at 0.7 mm ERD, as electrodes located at ERDs that require a stimulation above the safe charge density limit cannot elicit phosphenes. (d) The values of ${\sigma _x}$ (black solid line) and ${\sigma _y}$ (red dash-dot line) as functions of the ERD. The phosphene shape as a function of the ERD was equally simulated for both electrode diameters. In figures (a)–(c), the graphs are linear until 0.5 mm and then become exponential at the range of 0.5–0.7 mm, equivalently to the threshold current dependence on the ERD.

2.2. ERD-affected perceptual model

It has been found that electrodes located farther from the retina may not stimulate the retinal ganglion cells sufficiently, which affects the ability to elicit visual perception within safe charge density limits [4, 7, 10, 41]. The simulation assumes that, as the ERD increases, the phosphene's brightness decreases until it falls to the background brightness, because the electric field that reaches the ganglion receptive fields weakens as the ERD increases. This assumption is consistent with the literature because the stimulation amplitude decreases as the ERD increases, and a reduction in the stimulation amplitude is associated with the elicitation of darker phosphenes [48]. Therefore, in the simulation, electrodes cannot elicit phosphenes above a certain ERD, referred to as ERD_crit. The graph of the phosphene brightness as a function of the ERD is presented in figure 2(c).

In cases where ERD > 0.36 mm, assuming a stimulation of the 200 µm electrodes with a pulse duration of 0.45 ms within the safe charge density limit, the Gaussian function vanishes, i.e. there is no perception of phosphenes (black pixels). The same holds for 250 µm electrodes given the same pulse duration at ERD > 0.675 mm. The ERD also affects the size of phosphenes. As the ERD increases, the stimulation amplitude that reaches the retina decreases, and thus the phosphenes become smaller [48]. In the simulation, as the ERD increases, the phosphene brightness decreases. Because of the Gaussian profile, the decrease in brightness alone results in the perception of a much smaller phosphene. To control how much the size decreases, the function F_S(ERD) is used in equation (2). This function widens the Gaussian as the brightness of an ERD-affected phosphene falls, so that the phosphene appears smaller but not miniature. The graph of the phosphene size as a function of the ERD is presented in figure 2(b). It was found in Argus I and II users that, as the ERD increases, the elicited phosphenes are less elongated or rounder [8]. In the simulation, phosphenes are perceived as round above a relatively low ERD (around 0.3 mm), as users consistently describe phosphenes as round [2, 7, 11, 19, 41, 60–62]. A graph presenting the modeled Gaussian dimensions that represent the phosphene shape as a function of the ERD is shown in figure 2(d).

The simulation was constructed in MATLAB R2020b (Mathworks Inc., Natick, MA, USA) based on the characteristics of the Argus II [33]. It includes an electrode array with 6 × 10 electrodes and is implemented on a desktop display at 144 × 240 pixels resolution. Based on an equation in the form of a 2D elliptical Gaussian function that describes the phosphene's brightness distribution and allows the simulation of size variations, shape variations, spatial shifts, and dropout [53], a more general equation that also considers the phosphene's brightness variations and the ERD effects was developed:

$\begin{align}&f\left( {x,y,{\text{ERD}}} \right)\nonumber\\ &\; = AB{F_B}\left( {{\text{ERD}}} \right)\exp \{ - S{F_S}\left( {{\text{ERD}}} \right)[a{{\left( {x - i{x_0} - {x_1}} \right)}^2}\nonumber\\ & \quad + 2b\left( {x - i{x_0} - {x_1}} \right)\left( {y - j{y_0} - {y_1}} \right) + c{{\left( {y - j{y_0} - {y_1}} \right)}^2}]\} ,\end{align} \tag{ 2 }$

where $f\left( {x,y,{\text{ERD}}} \right)$ is the pixel's value in location (x, y) at a specific ERD. A represents the phosphene brightness at the center; it is the mean value of a pixel block in the high-resolution camera's frame, and it is quantized into four normalized values (0, 1/3, 2/3, and 1). If dropout is applied, A is set to 0 for a random 10% of the electrodes, which then cannot elicit phosphenes regardless of the stimulus. B and S represent the random variations in phosphene brightness and size and are applied to all phosphenes when considered; B is uniformly distributed between 0.7 (i.e. a 30% darker phosphene) and 1 (i.e. no change in the brightness of a phosphene); S is uniformly distributed between 0.5 (i.e. a 50% bigger phosphene) and 1.5 (i.e. a 50% smaller phosphene). ${F_B}\left( {{\text{ERD}}} \right) = 1 - {I_{{\text{th}},{\text{norm}}}}\left( {{\text{ERD}}} \right)$ , ${F_S}\left( {{\text{ERD}}} \right) = 1 - 0.5{I_{{\text{th}},{\text{norm}}}}\left( {{\text{ERD}}} \right)$ ; ${I_{{\text{th}}}}\left( {{\text{ERD}}} \right)$ is normalized for convenience, such that ${I_{{\text{th}},{\text{norm}}}}\left( 0 \right) = 0$ and ${I_{{\text{th}},{\text{norm}}}}\left( {{\text{ERD}} \geqslant {\text{ER}}{{\text{D}}_{{\text{crit}}}}} \right) = 1$ , because the maximum allowed current is reached at ERD_crit. At ERD_crit, the phosphenes disappear ( ${F_B}$ (ERD_crit) = 0) and their Gaussian is twice as wide ( ${F_S}$ (ERD_crit) = 0.5), because the phosphene becomes darker as ${F_B}$ (ERD) decreases and bigger as ${F_S}$ (ERD) decreases.
a, b, and c, the shape parameters of the elliptic Gaussian, are defined as $a = \frac{{{{\cos }^2}\theta }}{{2\sigma _x^2}} + \frac{{{{\sin }^2}\theta }}{{2\sigma _y^2}}$ , $b = - \frac{{\sin 2\theta }}{{4\sigma _x^2}} + \frac{{\sin 2\theta }}{{4\sigma _y^2}}$ , and $c = \frac{{{{\sin }^2}\theta }}{{2\sigma _x^2}} + \frac{{{{\cos }^2}\theta }}{{2\sigma _y^2}}$ , where ${\sigma _x}$ and ${\sigma _y}$ depend on the ERD and represent the spread of the phosphene along the two axes. ${\sigma _x}$ ranges from 1 to 4/3 and ${\sigma _y}$ ranges from 3/4 to 1. The angle $\theta$ represents the clockwise rotation of the ellipsoid-shaped phosphene and is uniformly distributed between 0° and 180°. When shape variations are applied, ${\sigma _x}$ and ${\sigma _y}$ are randomly varied for 15% of the phosphenes within the same value ranges in a uniform distribution. The indices $i\,{\text{and }}j$ denote the position of the phosphene in the array along the x and y axes, respectively, and ${x_0}$ and ${y_0}$ are the center-to-center spacings between adjacent phosphenes, so that the unshifted center of phosphene (i, j) lies at ( $i{x_0}$ , $j{y_0}$ ). ${x_1}$ and ${y_1}$ represent the spatial shifts; when applied, all the phosphenes are shifted by a uniformly distributed amount between −6 and 6 pixels along the x-axis and the y-axis.
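Equation (2) and the shape parameters can be written out as a short function. This is a sketch under stated assumptions, not the published implementation: the normalized threshold current is passed in directly as `i_norm` (a value in [0, 1]), the default spacings `x0 = y0 = 24` pixels are a hypothetical choice (240 / 10 and 144 / 6), and the scale relating σ to pixel units is not stated in the text, so the σ values here are in the same arbitrary units as x and y.

```python
import math


def shape_params(sigma_x, sigma_y, theta_deg):
    """Shape parameters a, b, c of the rotated elliptical Gaussian."""
    t = math.radians(theta_deg)
    a = math.cos(t) ** 2 / (2 * sigma_x ** 2) + math.sin(t) ** 2 / (2 * sigma_y ** 2)
    b = -math.sin(2 * t) / (4 * sigma_x ** 2) + math.sin(2 * t) / (4 * sigma_y ** 2)
    c = math.sin(t) ** 2 / (2 * sigma_x ** 2) + math.cos(t) ** 2 / (2 * sigma_y ** 2)
    return a, b, c


def phosphene_value(x, y, i, j, i_norm, A=1.0, B=1.0, S=1.0,
                    sigma_x=1.0, sigma_y=1.0, theta_deg=0.0,
                    x0=24.0, y0=24.0, x1=0.0, y1=0.0):
    """Equation (2): pixel value at (x, y) for the phosphene with indices (i, j).

    i_norm is the normalized threshold current: 0 at zero ERD, 1 at or above ERD_crit.
    x0 and y0 are hypothetical default spacings; x1 and y1 are the spatial shifts.
    """
    f_b = 1.0 - i_norm          # F_B: brightness factor, 0 at ERD_crit
    f_s = 1.0 - 0.5 * i_norm    # F_S: size factor, 0.5 at ERD_crit
    a, b, c = shape_params(sigma_x, sigma_y, theta_deg)
    dx = x - i * x0 - x1
    dy = y - j * y0 - y1
    q = a * dx * dx + 2 * b * dx * dy + c * dy * dy
    return A * B * f_b * math.exp(-S * f_s * q)
```

With `i_norm = 0` the function reduces to the plain elliptical Gaussian, and with `i_norm = 1` it returns 0 everywhere, which corresponds to a non-elicited phosphene.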

2.3. Electrode grouping

Groups of electrodes, such as quads, become necessary as the ERD increases. Phosphenes evoked by quads are usually bigger than phosphenes evoked by individual electrodes [12, 46]. The relation between the thresholds for evoking a phosphene by a single electrode ( ${T_{SE}}$ ) and by a quad ( ${T_Q}$ ) in the Argus II is T_SE = 2 × T_Q [32]. Figure 3 demonstrates the effect of quads on the threshold current amplitude for both 200 µm and 250 µm platinum electrodes stimulated by a pulse duration of 0.45 ms. Using quad stimulation instead of single-electrode stimulation for 200 µm electrodes increases ERD_crit from 0.36 mm to 0.685 mm.
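One way to decide where quads are needed is to tile the array into 2 × 2 blocks and flag only the blocks that contain an electrode beyond the single-electrode critical ERD. The sketch below is illustrative only: the non-overlapping tiling, the function names, and the rule of flagging a block when any of its electrodes exceeds the critical ERD are assumptions, not the procedure used in the Argus II.

```python
def quad_blocks(n_rows=6, n_cols=10):
    """Tile the array into non-overlapping 2x2 blocks of (row, col) indices."""
    return [[(r + dr, c + dc) for dr in (0, 1) for dc in (0, 1)]
            for r in range(0, n_rows, 2)
            for c in range(0, n_cols, 2)]


def quads_needing_grouping(erd_mm, erd_crit_single_mm):
    """Return the blocks that contain at least one electrode whose ERD exceeds
    the critical ERD for single-electrode stimulation (assumed selection rule)."""
    blocks = quad_blocks(len(erd_mm), len(erd_mm[0]))
    return [blk for blk in blocks
            if any(erd_mm[r][c] > erd_crit_single_mm for r, c in blk)]


if __name__ == "__main__":
    # Hypothetical ERD map in mm: 0.3 everywhere, 0.7 in the two central columns.
    erd = [[0.3] * 10 for _ in range(6)]
    for row in erd:
        row[4] = row[5] = 0.7
    flagged = quads_needing_grouping(erd, erd_crit_single_mm=0.675)
    print(len(quad_blocks()), "blocks in total;", len(flagged), "flagged for quad stimulation")
```

Flagging blocks selectively keeps single-electrode resolution wherever individual electrodes can still elicit phosphenes.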

**Figure 3.** Threshold current as a function of the ERD for a single-electrode and a quad stimulation for platinum electrodes with a diameter of (a) 200 µm and (b) 250 µm. The graphs are linear until 0.5 mm and then exponential above 0.5 mm. The black and red dashed lines represent the safe charge density limit for a 0.45 ms pulse duration for 200 µm and 250 µm diameter electrodes, respectively. The dashed and solid pink lines represent the critical ERDs, above which single electrodes and quads cannot evoke phosphenes within the safe charge density threshold, respectively. The critical ERD for single 200 µm electrodes and quads is 0.36 mm and 0.685 mm, while the critical ERD for single 250 µm electrodes is 0.675 mm.

2.4. ERD-based image enhancement

Another approach proposed for reducing the effects of ERD is to change the input signal sent to the prosthesis. The first step is to measure the ERDs of all the electrodes in the array. This can be done by OCT measurements [32]. An ERD matrix should be created accordingly (like figure 1(c)). Subsequently, an ERD correction threshold should be defined, such that the values of the pixels in the input image that correspond to electrodes with ERDs above the ERD correction threshold are increased, and the values of the pixels that correspond to electrodes with ERDs below the ERD correction threshold are left untouched. An element-wise (point-by-point) multiplication between the low-resolution input image (i.e. one that fits the size of the electrode array, 10 × 6 for the Argus II) and the correction matrix is then performed. Figure 4 presents the ERD matrix from figure 1(c) and two examples of corresponding correction matrices. Different correction matrices could be helpful in different cases (elaborated in the discussion). In figure 4(b), the ERD correction threshold was defined as the minimum ERD (350 µm) in the ERD matrix. In figure 4(c), the ERD correction threshold was defined as an arbitrary higher value in the ERD matrix (420 µm). Any value can be taken for the ERD correction threshold; this value can be smaller or larger than the minimum ERD.
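The thresholding and element-wise multiplication described above can be sketched as follows. The text does not give the formula that maps an ERD to a correction gain, so the rule used here (gain equal to ERD divided by the correction threshold, capped at `max_gain`) is purely a hypothetical placeholder, as are the function names and the clipping of enhanced pixel values to `max_value`.

```python
def correction_matrix(erd_um, threshold_um, max_gain=2.0):
    """Gain of 1 at or below the correction threshold; above it, a gain that
    grows with the ERD (hypothetical rule: erd / threshold, capped at max_gain)."""
    return [[1.0 if e <= threshold_um else min(max_gain, e / threshold_um)
             for e in row]
            for row in erd_um]


def enhance(image, gains, max_value=1.0):
    """Element-wise product of the low-resolution image and the correction
    matrix, clipped to the valid pixel range (clipping is an assumption)."""
    return [[min(max_value, p * g) for p, g in zip(img_row, gain_row)]
            for img_row, gain_row in zip(image, gains)]


if __name__ == "__main__":
    erd_row = [[350.0, 420.0, 700.0]]   # hypothetical ERDs in µm
    image_row = [[0.5, 0.5, 0.5]]       # normalized pixel values
    for threshold in (350.0, 420.0):
        gains = correction_matrix(erd_row, threshold)
        print(threshold, gains, enhance(image_row, gains))
```

A lower correction threshold boosts more pixels, which mirrors the difference between the two example matrices in figure 4.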

**Figure 4.** Creation of correction matrix based on the ERD matrix. (a) The ERD matrix from figure 1. (b) and (c) A corresponding correction matrix for ERD correction threshold of 350 µm and 420 µm, respectively.

2.5. Image quality assessment

An experiment with subjects and full-reference image similarity metrics were carried out to quantitatively assess the effects of quads and image enhancement on prosthetic image quality. Twenty normally sighted subjects (10 females), aged 20–62 (average: 32.5), were recruited to verify whether quads and image enhancement improve the quality of prosthetic images. The experiment was approved by the Ben-Gurion University Human Research Ethics Committee, and all subjects provided their informed consent. Ten original binary images (figure 5) were simulated using four different methods to illustrate four versions of prosthetic vision: ERD-affected image, ERD-affected image with a quad, ERD-affected image with input-image enhancement, and ERD-affected image with quad and input-image enhancement. The order of the prosthetic views was different for each binary image.

**Figure 5.** The binary images used in the experiment and the image quality metrics: a key, a triangle, a rectangle, an ellipse, a cat, a vase, a car, a cellphone, a shoe, and a basket.

The subjects were asked to rank the four prosthetic views according to their similarity to the original image in one session that lasted approximately 10 min (one minute for each image set ranking). The ranking was ordinal between 1 and 4; 1 for the least similar image and 4 for the most similar image. The significance of the experiment (for all the algorithms together) was estimated using the Friedman test, and the multiple comparisons significance between every two algorithms was estimated using the Wilcoxon rank test in SPSS 26.0 software (SPSS Inc., Chicago, IL).
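For ordinal rankings of this kind, the Friedman statistic can be computed directly from the rank sums. The sketch below uses the textbook formula for k conditions ranked 1 to k without ties; it is not the SPSS procedure used in the study, and the rankings in the example are made-up toy values, not experimental data.

```python
def friedman_chi_square(rankings):
    """Friedman statistic for n rankings of k conditions (ranks 1..k, no ties).

    chi2 = 12 / (n k (k + 1)) * sum_j R_j^2 - 3 n (k + 1),
    where R_j is the sum of the ranks given to condition j.
    """
    n = len(rankings)
    k = len(rankings[0])
    rank_sums = [sum(r[j] for r in rankings) for j in range(k)]
    return 12.0 / (n * k * (k + 1)) * sum(R * R for R in rank_sums) - 3.0 * n * (k + 1)


if __name__ == "__main__":
    # Toy rankings of four prosthetic views (illustrative only).
    toy = [[1, 2, 3, 4], [1, 2, 4, 3], [2, 1, 3, 4], [1, 3, 2, 4], [1, 2, 3, 4]]
    print("Friedman chi-square:", friedman_chi_square(toy))
```

The statistic is 0 when all conditions receive the same rank sum and reaches its maximum, n(k − 1), when every ranking is identical.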

In addition, three full-reference image similarity metrics based on the human visual system (HVS) were used to produce objective quantitative assessments of the similarity between each prosthetic view and its corresponding original image. The PSNR-HVS-M (peak signal-to-noise ratio, human vision system, modified) metric is based on the peak signal-to-noise ratio and considers the contrast sensitivity function and between-coefficient contrast masking of discrete cosine transform basis functions [63]. The MS-SSIM (multiscale structural similarity) metric is based on the high adaptation of the HVS to extracting structural information from a scene, and it is more robust than the traditional SSIM metric because it combines the SSIM index of several versions of the image at various scales [64]. The CW-SSIM (complex-wavelet structural similarity) metric is a structural similarity metric based on the idea that small non-structural distortions lead to consistent phase changes in the local wavelet coefficients, and that such a consistent phase shift does not change the image's structural content [65].

2.6. Object scanning

Scanning an object with slow, patterned head movements can also reduce the perceptual loss of information in the prosthetic FOV caused by the ERD. The head movements allow electrodes with smaller ERDs to convey the information about the scene that the electrodes with large ERDs failed to convey. During object scanning, the phosphenes persist. The phosphene persistence period (PER) is defined as the time it takes a phosphene to fall to the value of the background following stimulation [17]. Generally, users who experience short persistence should scan objects faster than users who experience long persistence [17]. To define the scanning speed in deg s⁻¹, it is necessary to define the simulation's FOV. The FOV is set according to the Argus II FOV (19° × 11° [26]), which gives approximately 13 pixels deg⁻¹. The results demonstrate object scanning at approximately 4.5 deg s⁻¹ with persistence and at 9 deg s⁻¹ with and without persistence.
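The conversion between angular scanning speed and display pixels follows from the FOV and the display resolution. In this short sketch, the assignment of the 240-pixel side to the 19° side of the FOV is an assumption, and the function names are hypothetical.

```python
FOV_DEG = (19.0, 11.0)    # Argus II field of view (horizontal, vertical), from the text
DISPLAY_PX = (240, 144)   # simulation display (columns, rows); orientation is assumed


def pixels_per_degree():
    """Display pixels per degree of visual angle along each axis."""
    return tuple(px / deg for px, deg in zip(DISPLAY_PX, FOV_DEG))


def scan_speed_px_per_s(speed_deg_per_s):
    """Horizontal scanning speed expressed in display pixels per second."""
    return speed_deg_per_s * pixels_per_degree()[0]


def sweep_time_s(speed_deg_per_s):
    """Time needed to sweep the full horizontal FOV at the given speed."""
    return FOV_DEG[0] / speed_deg_per_s


if __name__ == "__main__":
    print("pixels per degree:", pixels_per_degree())
    for speed in (4.5, 9.0):
        print(speed, "deg/s:", scan_speed_px_per_s(speed), "px/s,",
              sweep_time_s(speed), "s per horizontal sweep")
```

Both axes come out close to the 13 pixels deg⁻¹ quoted in the text (about 12.6 horizontally and 13.1 vertically).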

3. Results

For all the simulation results, a pulse duration of 0.45 ms and an electrode diameter of 250 µm were used. Thus, single electrodes can evoke phosphenes if the ERD is smaller than 0.675 mm (see figure 2(a)). For ERDs above 0.675 mm, the threshold current required to evoke phosphenes exceeds the safe charge density limit, and single electrodes cannot evoke phosphenes. This is the more lenient case, in which the electrodes are bigger and the maximum allowed stimulation current is higher. For electrodes with a diameter of 200 µm, as in the Argus II, the critical ERD for single-electrode stimulation is 0.36 mm because the maximum allowed current is much lower (see figure 2(a)). It is assumed that the tack is on the left side, and the maximum ERD is at the sixth electrode column.

3.1. The effects of various maximum ERDs

The ERD affects three spatial aspects that change the appearance of phosphenes: size, brightness, and shape. Phosphenes that correspond to electrodes located farther from the retinal surface (large ERDs) appear smaller, darker, and less elongated. The simulated impact of various ERD_max values (0.1, 0.3, 0.5, and 0.7 mm) in an electrode array that contains 60 electrodes is shown in figure 6. The largest ERD_max that was taken is 0.7 mm (figure 6(d)), as found in the literature for the Argus II [32]. For 250 µm electrodes, the ERD_crit is 0.675 mm, and thus there is no phosphene perception in the middle of figure 6(d), where ERD_max is 0.7 mm. Note that for 200 µm electrodes (as in Argus II), these results would be repeated for a much smaller ERD_max (0.36 mm). The simulations presented in figure 6 include only the spatial effects that the ERD causes, without random spatial variations.

**Figure 6.** The effect of various maximum ERDs on prosthetic vision with 60 electrodes with a diameter of 250 µm. (a) Original image of a zebra crossing. (b)–(d) Prosthetic views of (a) with maximum ERDs of 0.1, 0.4, and 0.7 mm, respectively. As the ERD increases, the electrodes receive weaker stimuli, and the phosphenes appear darker and less elongated. The electrodes cannot elicit phosphenes if the ERD is larger than the critical ERD, which is 0.675 mm, as seen in (d).

3.2. Applying electrode grouping to reduce the threshold current amplitude

The images presented in figure 7 demonstrate how electrode grouping may improve perception mediated by electrodes located far from the retina (i.e. around ERD_crit or above). For 250 µm electrodes, the ERD_crit is 0.675 mm, and thus quads would be beneficial for ERD > 0.675 mm, but even for smaller ERDs, such as 0.6 mm, quads can help in eliciting clearer phosphenes. Note that 200 µm electrodes (as in Argus II) would require quads for ERD > 0.36 mm. The use of 15 quads, as presented in figure 7(b), is unnecessary because it decreases the resolution from 60 to 15, while most of the electrodes can elicit phosphenes without grouping; only a small number of electrodes in the center of the FOV cannot elicit phosphenes because of their large ERDs. Using quads only for specific electrodes with large ERDs could help elicit phosphenes where single electrodes cannot (or can barely) do so, without decreasing the resolution elsewhere. The simulations presented in figure 7 do not include the random spatial variations of the phosphenes.

3.3. The combined impact of ERD and other spatial variations

Figure 8 compares the perceptual effects of various random spatial variations, such as size, brightness, dropout, and spatial shifts, that were previously simulated [53–56] with the perceptual effects caused by the ERD. Figure 8(b) presents a naïve prosthetic vision simulation where no spatial aspects are considered. Figure 8(c) illustrates how adding the spatial variations to the prosthetic view decreases the image quality. The perceptual effects caused by the ERD may have a more significant influence on the quality of the prosthetic image than the effects of the random spatial variations, as demonstrated in figure 8(d) in comparison to figure 8(c). In figure 8(e), the use of quads in restoring vision loss is demonstrated again.

3.4. Applying ERD-based input-image processing to reduce vision loss

A reduction in the adverse spatial effects caused by ERD, especially the inability to evoke phosphenes above ERD_crit, is demonstrated by using the correction matrices from figures 4(b) and (c), which are based on the ERD matrix presented in figure 1(c). Figure 9 presents an image of a key and its corresponding ERD-affected prosthetic view, with an ERD_max of 0.7 mm and 60 electrodes, each with a diameter of 250 µm.

**Figure 9.** Applying a correction to the input image based on the individual user's ERDs with 60 electrodes with a diameter of 250 µm and a maximum ERD of 0.7 mm. (a)–(b) An image of a key in grayscale (8-bit) before the multiplication by the correction matrix and its corresponding prosthetic view. (c)–(d) The key image after the multiplication by the correction matrix from figure 4(c) and its corresponding prosthetic view. (e)–(f) The key image after the multiplication by the correction matrix from figure 4(b) and its corresponding prosthetic view. The narrow part of the key is not visible in (b) because of the large ERDs associated with the electrodes supposed to represent it. The narrow part of the key is partially visible in (d) and even more visible in (f).

3.5. Quantitative assessments based on an experiment with subjects and image quality metrics

Figure 10 presents an example of a binary image and its four prosthetic views. 60 electrodes with an electrode diameter of 250 µm, pulse duration of 0.45 ms, and a constant ERD_max of 0.6 mm (just below ERD_crit = 0.675 mm) were chosen for all the prosthetic views of each binary image. Figure 11 presents a box plot of the central and spread tendencies of the four prosthetic views as found in the experiment. The experiment found significant differences across all the four prosthetic views (Friedman test: χ² = 397.16, P < 0.0001) and between every two algorithms (Wilcoxon rank test: P < 0.0001 for all comparisons). According to figure 11, the subjects' significant preference was for the ERD-affected image with quad and input-image enhancement (median: 4, 25th percentile: 3) and ERD-affected image with input-image enhancement (median: 3, 75th percentile: 4) over the ERD-affected image with a quad (median: 2, 75th percentile: 2.5) and ERD-affected image without improvements (median: 1, 75th percentile: 2). The ERD-affected image with a quad score was sufficiently higher than the ERD-affected image without improvements to be considered helpful in prosthetic vision.

**Figure 11.** A box plot presenting the experimental ranking results. From left to right: the first box represents the ERD-affected prosthetic view without any improvements; the second box represents the ERD-affected prosthetic view with a quad; the third box represents the ERD-affected prosthetic view with input-image enhancement, and the fourth box represents the ERD-affected prosthetic view with quad and input-image enhancement. The ERD-affected image with a quad and input-image enhancement and ERD-affected image with input-image enhancement scores were significantly higher than the other two prosthetic views.

Table 2 presents the mean scores of the computational image similarity metrics: PSNR-HVS-M, MS-SSIM, and CW-SSIM. The prosthetic view with both the quad and image enhancement obtained the highest score according to the PSNR-HVS-M and MS-SSIM metrics, and the prosthetic view with only the image enhancement obtained the highest score according to the CW-SSIM metric. These results are in line with the results from the experiment with subjects.

Table 2. Mean scores of various image similarity metrics for the four different prosthetic views from figure 10.

| Image similarity metric | Without improvement | Quad | Enhancement | Quad & enhancement |
| --- | --- | --- | --- | --- |
| PSNR-HVS-M | 7.60 ± 0.95 | 8.21 ± 1.15 | 9.04 ± 0.94 | 9.69 ± 1.17 |
| MS-SSIM | 0.60 ± 0.09 | 0.61 ± 0.1 | 0.66 ± 0.09 | 0.67 ± 0.09 |
| CW-SSIM | 0.40 ± 0.07 | 0.49 ± 0.08 | 0.58 ± 0.07 | 0.56 ± 0.1 |

3.6. Object scanning reduces the vision loss caused by the ERD

Figure 12 illustrates the perception of the key during scanning at a fast speed of 9 deg s⁻¹ performed by the user with and without enhancement of the input image and with and without the persistence effect (the videos are provided in the supplementary data (available online at stacks.iop.org/JNE/19/035001/mmedia)). When the user performs head movements to scan an object, the object is observed sequentially in different parts of the FOV. In some of these parts the electrodes can evoke phosphenes, which allows the perception of more parts of the object (figures 12(2(b) and (c))), including those that were missing in the first static image (the narrow part of the key). During the head movements, the object may be perceived with a smearing caused by the persistence effect (figures 12(3(b), (c), (e) and (f))). In the last part of the video, the scanning focused only on the narrow part of the key, which helps the user perceive the part that was initially lost because of the ERD, without the smearing caused by the persistence of the wide part of the key. Enhancing the input image using the correction matrix improves the perception of the object by eliciting brighter phosphenes at large ERD locations, both without scanning and during scanning (figures 12(d)–(f)). However, as the phosphenes become brighter, the smearing caused by the persistence becomes more visible. In this case, it can be helpful to decrease the scanning speed [17]. Figure 13 illustrates the scanning of the key with 2 s of persistence at two different speeds, 4.5 deg s⁻¹ and 9 deg s⁻¹. It shows that a slower scanning speed preserves the improvement in the object's visibility while reducing the adverse effect of smearing.
In the videos linked in the captions of figures 12 and 13, the electrode activation rate was set to 6 frames per second (FPS), matching the Argus II default stimulation rate of 6 Hz [45]. The display frame rate of the simulation was 30 FPS, and the PER was either 0 or 2 s.
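The relation between these rates and the two scanning speeds can be sketched with simple arithmetic. The helper names below are hypothetical, and the sketch assumes a constant scanning speed and a phosphene that stays visible for the full PER; it is not the temporal model of [17], only a rough way to see why halving the speed shortens the smear.

```python
STIM_RATE_HZ = 6        # electrode activation rate stated in the text
DISPLAY_RATE_FPS = 30   # simulation display rate stated in the text
PER_S = 2.0             # phosphene persistence used in the videos


def stimulation_index(display_frame):
    """0-based index of the stimulation update shown at a display frame."""
    return display_frame * STIM_RATE_HZ // DISPLAY_RATE_FPS


def displacement_deg(speed_deg_per_s, duration_s):
    """Visual angle swept by a constant-speed scan during duration_s."""
    return speed_deg_per_s * duration_s


# Each stimulation update is held for this many display frames.
frames_per_update = DISPLAY_RATE_FPS // STIM_RATE_HZ

for speed in (9.0, 4.5):
    per_update = displacement_deg(speed, 1.0 / STIM_RATE_HZ)
    during_per = displacement_deg(speed, PER_S)
    print(f"{speed} deg/s: {per_update:.2f} deg per update, "
          f"{during_per:.1f} deg swept during the PER")
```

Under these assumptions, the scene moves 18 deg during a 2 s PER at 9 deg s⁻¹ but only 9 deg at 4.5 deg s⁻¹, so the trail of persistent phosphenes spans half the visual angle at the slower speed.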

**Figure 13.** The effects of different object scanning speeds in restoring vision loss caused by the ERD with 60 electrodes with a diameter of 250 µm, an electrode activation rate of 6 FPS, a simulation display rate of 30 FPS, with PER of 2 s, and with and without image enhancement using the correction matrix from figure 4(b). Columns: (1) Different frames from the *Fast scanning of an enhanced key* and *Slow scanning of an enhanced key* videos (the videos can be viewed by clicking on their names). These frames exist in both videos. For example, the 100th frame from the *Fast scanning of an enhanced key* video is equal to the 200th frame from the *Slow scanning of an enhanced key* video, as the scanning speed is reduced by a factor of 2. (2) Corresponding simulated prosthetic views of the video frames presented in (1) for scanning at 9 deg s⁻¹ (the video can be viewed at *Fast scanning of an enhanced key with PER*). (3) Corresponding simulated prosthetic views of the video frames presented in (1) for scanning at 4.5 deg s⁻¹ (the video can be viewed at *Slow scanning of an enhanced key with PER*). In (2) compared to (3), the phosphene persistence results in a more visible smearing effect (longer tail of persistent phosphenes) that impairs the shape perception of the object.

4. Discussion

This study highlights the importance of ERD in prosthetic vision. As illustrated in figure 8, the ERD effects (especially the vision loss effect) may be more dominant than the effects of the spatial variations illustrated in previous simulations [53, 55, 56, 66, 67]. Therefore, ERD effects should be considered in prosthetic vision simulations, user training, and in the specification of future visual prostheses. By considering the ERD of each electrode in the array, one can anticipate how perception through prosthetic implants may be affected and whether and where to use electrode grouping. Electrodes with an ERD smaller than ERD_crit may still evoke only barely visible phosphenes. Thus, electrode grouping should be used for electrodes with an ERD equal to or slightly smaller than ERD_crit. Generally, electrode grouping should be used only when none of the electrodes in the group can evoke phosphenes on its own, as electrode grouping in the form of quads decreases the spatial resolution by up to a factor of 4.
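The grouping rule described above can be sketched as a scan over the user's ERD matrix. This is a minimal illustration only: the function name, the non-overlapping 2 × 2 tiling, the `margin_mm` parameter (standing in for "slightly smaller than ERD_crit"), and the toy ERD values are all assumptions, not the authors' implementation.

```python
ERD_CRIT_MM = 0.675  # value quoted in the text for 250 um electrodes, 0.45 ms pulses


def quads_to_group(erd_mm, erd_crit_mm=ERD_CRIT_MM, margin_mm=0.0):
    """Return the top-left (row, col) of every non-overlapping 2x2 block in
    which all four electrodes have an ERD of at least erd_crit_mm - margin_mm,
    i.e. none is expected to evoke a clearly visible phosphene on its own."""
    rows, cols = len(erd_mm), len(erd_mm[0])
    threshold = erd_crit_mm - margin_mm
    selected = []
    for r in range(0, rows - 1, 2):
        for c in range(0, cols - 1, 2):
            block = [erd_mm[r + dr][c + dc] for dr in (0, 1) for dc in (0, 1)]
            if all(erd >= threshold for erd in block):
                selected.append((r, c))
    return selected


# Hypothetical 4x4 ERD map in mm (not measured data).
toy_erd = [
    [0.30, 0.35, 0.70, 0.72],
    [0.32, 0.40, 0.68, 0.75],
    [0.50, 0.70, 0.66, 0.69],
    [0.55, 0.71, 0.70, 0.80],
]
print(quads_to_group(toy_erd))                  # strict rule
print(quads_to_group(toy_erd, margin_mm=0.02))  # also group near-critical ERDs
```

Requiring all four electrodes to fail the test keeps individually usable electrodes ungrouped, which limits the loss of spatial resolution to the regions where no phosphene would be evoked anyway.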

Increasing the safe charge density limit to prevent perception loss caused by large ERDs can be achieved by reducing the pulse duration, lowering the pulse amplitude, and increasing the electrode size. Argus I and II prostheses have been reported to use biphasic pulse durations of 0.45 ms/phase (without interphase delay) [4, 8] and 0.975 ms/phase (with 0.975 ms interphase delay) [10, 41]. These pulse durations result in axonal stimulation associated with the elicitation of elongated phosphenes [8, 68]. It has been found that somatic stimulation requires a lower threshold current than axonal stimulation, by up to 73% in a model study [69] and by up to 50% in rabbits' retinas [43]. This suggests that increasing the pulse duration can help avoid axonal stimulation, as has been shown during in-vitro stimulations of rabbits' retinas [44], and can also help reduce the vision loss caused by the ERD.
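The dependence of the available stimulation current on pulse duration and electrode size can be illustrated with the basic relations Q = I · t and charge density = Q / area for a disc electrode. The sketch below is an assumption-laden illustration: the function name is hypothetical, the 0.35 mC cm⁻² limit is only a placeholder value often quoted for platinum and is not taken from this section, and the change of threshold current with pulse duration is not modeled.

```python
import math


def max_current_uA(diameter_um, pulse_ms, limit_mC_per_cm2):
    """Largest per-phase current (uA) that a disc electrode can deliver
    without exceeding a given charge density limit."""
    radius_cm = diameter_um * 1e-4 / 2.0       # 1 um = 1e-4 cm
    area_cm2 = math.pi * radius_cm ** 2        # geometric area of the disc
    max_charge_mC = limit_mC_per_cm2 * area_cm2
    return max_charge_mC / pulse_ms * 1e6      # mC / ms = A, then A -> uA


LIMIT = 0.35  # mC/cm^2, placeholder assumption (not a value from this section)

for diameter in (200, 250):
    for pulse in (0.45, 0.975):
        current = max_current_uA(diameter, pulse, LIMIT)
        print(f"d = {diameter} um, t = {pulse} ms/phase: I_max = {current:.1f} uA")
```

For a fixed charge density limit, the maximum current scales with the electrode area and inversely with the pulse duration, so a 250 µm electrode allows about 1.56 times the current of a 200 µm electrode at the same pulse duration.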

The electrode radius can help define the desirable ERD. ERDs below the electrode radius are not ideal, as the electric field may be non-uniform, resulting in an unreliable stimulation [42]. It may also cause damage to the retina [70]. ERDs above the electrode radius are also not recommended because of the substantial decay of the electric field [71, 72] and crosstalk between neighboring electrodes [42]. Thus, an ideal ERD is about the size of the electrode radius. Smaller electrodes require lower current amplitude in prostheses [42]. However, the safe charge density limit decreases with the electrode size, and thus the ERD_crit decreases with the electrode size. This suggests that changing the electrode size is not the right approach to reduce the ERD effects. An approach that may reduce the ERD's impact involves using an electrode array with various electrode sizes according to the ERD of each electrode. Such a configuration would allow the stimulation current amplitude for electrodes with a large ERD to be increased without exceeding the safe charge density limit. Another idea involves the development of a multi-depth electrode array (an electrode array in which each electrode has a different depth below the surface of the electrode array tack). In both approaches, the planning of the electrode array should take place after defining in which area of the retina the electrode array will be implanted and executing a pre-surgery OCT measurement of that retinal surface to evaluate the ERDs.

Alternatively, reducing the vision loss effect of the ERD can be done using image processing techniques. As presented in figure 9, enhancing the input image using a correction matrix by increasing the values of pixels according to the ERDs of the electrodes that represent them may reduce vision loss by evoking phosphenes that would not be evoked otherwise. This process is done on the input video before the data is sent to the receiver inside the eye and should be executed one time after the surgery for each subject according to the ERDs of the implant. This method could only work without disadvantages if the pixels that represent information corresponding to the large ERD electrodes do not already represent white objects that require a maximum stimulus (i.e. cannot be further enhanced). In cases where these pixels represent white objects, renormalizing the input image could be of interest, i.e. instead of ranging from 0 (black object) to 1 (white object), it will range from 0 to 0.8, where 0.8 up to 1 is for ERD compensation. However, this may harm the overall image contrast in small ERDs. The ERD correction threshold could be set to a constant value according to the ERD matrix of the user. Alternatively, it could be an adaptive variable that is initially based on the ERD matrix of the user but can be adapted to the input image. For example, it could depend on the brightness of the input image; a higher ERD correction threshold will be used for a brighter input image and vice versa. Using image enhancement could be more useful if combined with scene simplification. Scene simplification through deep-learning and image processing techniques improves the perception in prosthetic vision [53, 73–75].
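The enhancement with renormalization described above can be sketched as follows. This is a minimal illustration, assuming the correction matrix is already available as one gain (≥ 1) per pixel; how the gains are derived from the ERDs is not reproduced here, and the function name, the `headroom` parameter, and the toy values are hypothetical.

```python
def enhance(image, correction, headroom=0.2):
    """Rescale an image from [0, 1] to [0, 1 - headroom], multiply each pixel
    by its correction gain, and clip the result to 1.

    image, correction: equally sized lists of rows of floats.
    headroom: fraction of the range reserved for ERD compensation
              (0.2 matches the 0 to 0.8 example in the text).
    """
    top = 1.0 - headroom
    return [
        [min(1.0, pixel * top * gain) for pixel, gain in zip(img_row, gain_row)]
        for img_row, gain_row in zip(image, correction)
    ]


# Hypothetical 2x2 example: white pixels in the top row, larger gains where
# the corresponding electrodes are assumed to have larger ERDs.
toy_image = [[1.0, 1.0], [0.5, 0.0]]
toy_gain = [[1.0, 1.25], [1.5, 2.0]]
enhanced = enhance(toy_image, toy_gain)
print(enhanced)
```

In this toy case, the white pixel over a small-ERD electrode drops to 0.8 while the white pixel with a gain of 1.25 reaches the full value of 1, which shows both the compensation and the contrast cost at small ERDs mentioned in the text.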

The prosthetic view with the quad and image enhancement was superior in both the experimental results and the computational image similarity metrics, as shown in figure 11 and table 2. The prosthetic view with only image enhancement was right below it in the experimental results and most image similarity metrics. These two prosthetic views were significantly more similar to the original image than the prosthetic views with the quad and without any improvement (the ERD-affected prosthetic view). Most of the subjects chose the ERD-affected prosthetic image and the prosthetic image with the quad as the worst and debated between the prosthetic views of the enhanced input-image with and without the quad.

The quantitative assessment with subjects and the image similarity metrics were conducted under two strict assumptions. First, the simulated electrode diameter was 250 µm, for which ERD_crit (0.675 mm) is much higher than for a diameter of 200 µm (0.36 mm). Second, an ERD_max of 0.6 mm was chosen for all the prosthetic views, which is below ERD_crit. Thus, all the electrodes could evoke phosphenes in all the prosthetic views. If ERD_max were chosen to be equal to ERD_crit, the image similarity metrics and probably also the experimental results would show a more significant preference for the prosthetic views with the improvements than the ERD-affected prosthetic view without any improvement.

Incorporating the temporal model [17] in the simulation revealed that the impact of the ERD spatial effects, and specifically the vision loss that results from the inability of electrodes with large ERDs to evoke phosphenes, could be reduced using head scanning. Scanning allows different electrodes with lower ERDs to deliver, at different times, the information that the electrodes with an ERD above ERD_crit cannot deliver. Simulating scanning with phosphene persistence revealed that part of the invisible object in figure 9(c) can be observed intermittently and even continuously with phosphene persistence, which may improve perception in similar cases. A combination of the temporal model with image enhancement results in overall better perception of the object by eliciting brighter phosphenes, and specifically in an almost complete perception of the narrow part of the key that was missing from the prosthetic FOV before the scanning, as presented in figure 9(d) compared to figure 9(a). In the same way, scanning may also improve prosthetic perception by delivering, over time, the information of defective electrodes.

It has been pointed out that the recommended scanning speed varies among users, and it should depend on the phosphene persistence associated with the user [17]. Figure 13 demonstrates this recommendation. While object scanning at 9 deg s⁻¹ without persistence efficiently reduced the ERD effects and allowed the perception of the whole object intermittently, the same scanning with a PER of 2 s caused a smearing effect. Decreasing the scanning speed by a factor of 2 was equally efficient in reducing the vision loss caused by the ERD but with a reduced smearing effect. Scanning the scene with head movements may be exhausting for patients and thus cannot be the primary mitigation strategy for the ERD's adverse effects. However, slow head movements can still help patients in this respect.

The results presented in this study highlight meaningful interactions between the temporal and the spatial aspects of prosthetic vision. The temporal aspects, specifically the persistence and perceptual fading of phosphenes [17, 76], and some of the spatial aspects—the phosphene size and brightness—may interact with each other in a general sense in prosthetic vision. A decrease in a phosphene size or brightness during object scanning or when looking at a dynamic scene may occur when electrodes receive a lower stimulation amplitude (to represent a darker area in the input image) compared with a prior stimulation amplitude [48]. This decrease in size or brightness that corresponds to the spatial information in the scene could be mistaken for a decrease in brightness or size of a phosphene during the persistence or perceptual fading periods [17] and vice versa. In addition, new phosphenes, evoked by head movements or dynamic objects, which are weaker than persistent phosphenes (during the PER) or decayed phosphenes (during the perceptual fading period), may not be visible (i.e. suppressed by the temporal effects).

4.1. Limitations of the simulation

Currently, there is not enough data on the effect of the ERD on phosphene size. In the simulation, we assumed that the size decreases as the ERD increases, but with a limited reduction controlled by $F_S(\text{ERD})$. It has been found that decreasing the stimulation amplitude results in the elicitation of smaller phosphenes [48], and, because the stimulation amplitude decreases as the ERD increases, it could be associated with decreasing the phosphene size. However, distant electrodes may stimulate more receptive fields, resulting in a bigger combined phosphene.

The ERD may increase over time following implantation, as was found in users implanted with suprachoroidal prostheses [39], which means that the ERD effects may get worse after implantation. However, this effect was not demonstrated in this study.

5. Conclusions

The effects of ERD on various spatial aspects were illustrated, both separately and in combination with other spatial effects, on the experience of retinal prosthetic vision. These aspects of phosphenes have not previously been implemented in simulated prosthetic vision studies. The ERD effects were found to have a significant impact on prosthetic vision quality by affecting the size, brightness, and shape of phosphenes, but mainly by increasing the threshold current amplitude, which results in the inability of electrodes to sufficiently stimulate the retinal cells and elicit phosphenes within the safe charge density limit. Three different approaches were demonstrated and discussed to reduce these adverse effects: electrode grouping, ERD-based input image correction, and persistence-based object scanning. All three approaches were found to be effective in reducing the impact of ERD, and they could be used together to reduce the ERD effects even more. Considering these approaches can help improve the perception of current prostheses by reducing the vision loss effects caused by the ERD. The approaches and conclusions presented in this study may also help design future prostheses; for example, avoiding placement of the electrode array by one retinal tack, developing an electrode array that fits the ERDs, or grouping electrodes in advance according to the ERDs. A visual performance-based experiment with 20 sighted subjects and three different vision-based full-reference image similarity metrics showed that our techniques could reduce adverse ERD effects. The assumptions and simulations provided by this study could be verified by researchers who have access to implanted users.

Acknowledgment

This research was supported by the Israel Science Foundation (Grant No. 1519/20).

Data availability statement

All data that support the findings of this study are included within the article (and any supplementary files (available online at stacks.iop.org/JNE/19/035001/mmedia)).

Conflict of interest

The authors are not aware of any affiliations, memberships, funding, or financial holdings that might be perceived as affecting the objectivity of this manuscript. The authors declare no conflict of interest.
