Spatio-Temporal Electrode Mapping and Local Interleaved Stimulation Method (STEMLIS) for Artificial Sight Systems

Image processing algorithms play a key role in the development of visual prosthesis systems. In this study, a new electrode stimulation method for retinal implant systems, “Spatio-Temporal Electrode Mapping and Local Interleaved Stimulation (STEMLIS),” is proposed. In this method, the most meaningful spikes that preserve spatial discrimination within a temporal frame are selected as stimulation data, and phase shifts are applied to all sub electrode groups to minimize undesired electrode interactions. Because spatio-temporal mapping is used to select the stimulation data from a large number of spikes for a retinal-surface electrode matrix with a small number of electrodes, both the spatial and the temporal resolution of the stimulation data can be enhanced. The phase delays between neighboring electrodes introduce interleaved pulses, which minimize electrical interactions. To evaluate the contribution of the STEMLIS method to perceived image quality, computer-based quantitative simulation studies and visual evaluation tests with normally sighted subjects were performed. For quantitative evaluation, the outputs of the classical method and the STEMLIS method were compared based on the mean squared error (MSE), the histogram similarity ratio (HSR), and edge discrimination parameters. In the visual tests, the performance of the method was evaluated in contrast discrimination, pattern recognition, text reading, and object counting tasks. The subjects reached higher test scores with the proposed method, and the proposed method also scored better in the quantitative comparison. It is concluded that spatio-temporal enhancement of stimulation data can help to improve perceived image quality in visual prostheses.


Introduction
Clinical treatment methods are not available for degenerative retinal diseases such as retinitis pigmentosa (RP) or age-related macular degeneration (AMD), which are among the most common causes of blindness worldwide. Replacing the diseased retina with an electronic system that provides low-resolution light perception, called a "visual prosthesis", is an alternative route to sight restoration. The visual prosthesis system conveys visual data to blind people by stimulating an electrode matrix placed in the retina or the visual cortex. In this way, these people can partially perform daily activities such as recognizing some objects, navigating unaided, and reading and writing large print.
In visual prosthesis systems, stimulation of unimpaired sections of the human visual pathway with suitable electrical current waveforms elicits perceptions of generally rounded spots of light called "phosphenes" [1,2]. The purpose of the visual prosthesis system is to represent visual scenes in the real world as meaningful patterns of phosphenes in the patient's mind.
Depending on the electrode placement, visual prostheses are classified as epiretinal, subretinal, optic nerve, and cortical implants [3]. Generally, these systems consist of 3 main parts: a miniature camera, an image processor, and implant electronics. Although electronic retinal prostheses are still under clinical development and commercial products are expected to become available within a few years, several factors affect the performance of these systems. First, the image processing unit is the most important unit, since it encodes images as spike patterns and acts as an artificial retina. Second, the electrode interfacing and stimulation methods are also important for generating meaningful image perception in the patient's mind. Even if the artificial retina unit is assumed to act with the precision of a biological retina, inefficient electrode-fitting and stimulation methods will degrade the overall performance of the system.
Electrical interaction between implant electrodes (cross-talk) is another factor that degrades the performance of retinal implant systems. The same problem was encountered earlier in cochlear implant systems. Today's state-of-the-art cochlear implant systems use interleaved strategies to overcome channel interactions in the stimulation of the electrodes [20]. Electrode stimulation methods are more important in retinal implant systems, which include many more electrodes (for example, the 1600-electrode prototype of Retinal Implant AG [4], the 232-electrode ASIC system [12], and the 60-electrode Argus II system [21]) than cochlear prostheses (only 22 electrodes). A relatively high number of electrodes may be needed to provide acceptable image perception in future retinal implant systems. Clinical trials with epiretinal implant users have highlighted that specific electrode stimulation algorithms, and electronic systems supporting these algorithms, are required for proper stimulation [22]. Furthermore, local spike activities should be carefully analyzed for efficient data transmission to the low-resolution, low-bandwidth microelectrode array. In this respect, highly programmable dense electrode systems are very suitable for implementing efficient stimulation algorithms [15,16,19,23]. However, studies addressing these issues are not yet available in the literature. In particular, there is no simulation study that analyzes the effect of interleaved electrical stimulation on image perception quality. These issues provided the motivation for this study.
In this study, the spatio-temporal electrode mapping and local interleaved stimulation (STEMLIS) method is proposed to improve the performance of retinal prosthesis systems. The STEMLIS method uses temporal and spatial processing to select stimulation pulses from ganglion outputs and stimulates the electrodes with interleaved pulse-modulated signals to reduce electrode interactions. The advantages of the method were demonstrated in simulation studies that include quantitative comparisons and visual perception tests with normally sighted subjects. In the simulation study, it was assumed that the visual data to be stimulated were available as spike pulses encoded by the artificial retina model. Accordingly, the ganglion outputs for the test images were obtained beforehand using our custom artificial retina model [24,25]. The stimulation data were then used in the implemented STEMLIS and classical methods, and the results of both were analyzed. In the analysis step, the stimulation outputs of the STEMLIS method and the classical method were rendered as phosphene images and quantitatively compared using the mean squared error (MSE), histogram similarity ratio (HSR), and edge accuracy-based parameters. In the visual tests, the performance of the methods was compared in contrast discrimination, pattern recognition, text reading, and object counting tasks, which are important for daily visual activities.
The remainder of the paper is organized as follows. Section 2 describes the proposed STEMLIS method. Section 3 presents the simulation studies, including visual and quantitative evaluations. Sections 4 and 5 discuss and conclude the study, respectively.

Spatio-Temporal Electrode Mapping and Local Interleaved Stimulation (STEMLIS) Method
Because of their placement area and size, retinal implant systems need more specialized electrode stimulation methods than other implanted electrode arrays to reduce the electrical interactions of their high-density electrode matrix. While cochlear prostheses consist of fewer electrodes on a relatively short cable (1-2 cm in length), next-generation retinal prostheses should consist of at least 1000 electrodes in a very small area (about 1-2 mm²). Besides interaction-free stimulation, the selection of correct stimulation data is also important for creating meaningful image perception for implant users. To deal with these issues, the proposed method has 2 main steps. The first step is the spatio-temporal mapping of the stimulation data to the electrode matrix, and the second step is local interleaved stimulation to reduce electrode interaction in the retinal electrode matrix. These 2 steps are described in the following sections. A block diagram of the proposed STEMLIS method is shown in Figure 1.

Spatio-temporal mapping of the stimulation data to the electrode matrix
In the primate retina, approximately one million ganglion cells transmit image information to the visual cortex in the form of encoded firing activity. The number of electrodes in today's retinal implant systems is too low to meet the needs of high-resolution stimulation, although it satisfies the minimal requirements for reading large print and for navigation [26]. Interfacing the high-resolution image data to the low-resolution electrode matrix is a challenging task. Generally, the stimulation data are selected directly from images by down-sampling methods. In this way, the ganglion outputs produced by retinal encoder models are fitted to the electrode matrix size, but much of the detail in the original image data is lost to coarse down-sampling. Some methods select only the outputs at the corresponding electrode locations [27,28]. Other methods use image resizing functions to match the ganglion output dimension to the electrode matrix size [6,10,29]. Other image processing approaches, including more sophisticated methods, can be found in reference [30]. Clearly, applications based on static image stimulation lose temporal and spatial information in the ganglion codes.
Two approaches to the coding mechanism of retinal ganglion cells are widely accepted: rate coding and temporal coding [31]. The coding mechanism of the ganglion cells undoubtedly plays a major role in image formation in the visual cortex. In our opinion, therefore, the spatio-temporal changes of ganglion firing activity should be analyzed to optimize the selection of stimulation data. Thus, in the first stage of the STEMLIS method, we propose a processing step for spatio-temporal selection of stimulation data. In this step, the high-resolution retinal ganglion cell layer is interfaced to the relatively low-resolution electrode matrix (the electrode matrix must be smaller than the ganglion output matrix so that data selection can be applied) by selecting the spatially and temporally most meaningful ganglion output from the local neighborhood of each electrode, combined with temporal frame processing.
The spatio-temporal data selection process is illustrated in Figure 2. In this figure, $G(x,y)$ is the $M_G \times N_G$ ganglion matrix containing the ganglion outputs (spikes, 0 or 1), $E(i,j)$ is the $M_E \times N_E$ electrode matrix, and $K$ is the spatial window dimension, which is set by the ratio of the ganglion matrix size to the electrode matrix size (Eq. 1). In the first step of the algorithm, the temporal sum of the ganglion matrix at time $t$ is obtained over the temporal range $t+\tau$ according to Eq. 2, where $\tau$ is the number of temporal frames used for temporal analysis of the ganglion outputs and $x$ and $y$ are the spatial coordinates of the ganglion outputs. The temporal summation is performed for all $x$ and $y$ values ($x = 1,\dots,M_G$, $y = 1,\dots,N_G$).
By using the temporal sum values ($G_{sum}$) of the ganglion output matrix, the ganglion outputs in the $K \times K$ neighborhood of each electrode are analyzed and the spatially most meaningful ganglion output is obtained according to Eq. 3, where $x'$ and $y'$ are the spatial coordinates in the $K \times K$ window, and $S_{gx}$ and $S_{gy}$ are the $x$ and $y$ coordinates of the most meaningful ganglion output in that window. This process is repeated for each electrode in the electrode matrix ($g_x = 1,\dots,M_E \times N_E$, $g_y = 1,\dots,M_E \times N_E$).
After this step, for each electrode, the selected ganglion outputs for a temporal frame sequence are mapped (assigned) to the electrode matrix (Eq. 4).
$$E(i,j) = G(S_{gx}, S_{gy}), \quad i = 1,\dots,M_E, \quad j = 1,\dots,N_E \tag{4}$$
Since it is assumed that high-resolution ganglion outputs (spikes) are available, in this study we generated and stored ganglion outputs for several test images (from the Berkeley image database [32] and standard images) by implementing our artificial retina model. In the artificial retina model, the output size of the ganglion layer was determined by the image dimensions and by the receptive field sizes and overlaps. This yields ganglion output layers of various sizes. The spatial mapping parameters are determined automatically from the sizes of the electrode matrix and the ganglion output layer. For example, to map a 100 × 100 ganglion layer to a 20 × 20 electrode matrix, the local neighborhood of each electrode should be 5 × 5, and the 25 ganglion outputs in this neighborhood should be temporally analyzed.
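As a minimal sketch of the selection and mapping steps described above, the following Python code sums binary spike frames over a temporal window, picks one ganglion cell per K × K electrode neighborhood, and reads the electrode values from the selected cells. It assumes that the "most meaningful" output is the cell with the largest temporal sum, that the window covers frames t to t+τ−1, and that the ganglion matrix size is an exact multiple of K. The function names are illustrative and are not taken from the authors' implementation.

```python
def temporal_sum(frames, t, tau):
    """Sum binary spike frames over the assumed window t .. t+tau-1."""
    rows, cols = len(frames[0]), len(frames[0][0])
    total = [[0] * cols for _ in range(rows)]
    for frame in frames[t:t + tau]:
        for x in range(rows):
            for y in range(cols):
                total[x][y] += frame[x][y]
    return total


def select_sources(g_sum, k):
    """For each electrode, return the (x, y) cell with the largest
    temporal sum inside its K x K window (assumed selection rule).
    Ties go to the first cell in row-major order."""
    m_e, n_e = len(g_sum) // k, len(g_sum[0]) // k
    sources = []
    for i in range(m_e):
        row = []
        for j in range(n_e):
            window = [(x, y)
                      for x in range(i * k, (i + 1) * k)
                      for y in range(j * k, (j + 1) * k)]
            row.append(max(window, key=lambda p: g_sum[p[0]][p[1]]))
        sources.append(row)
    return sources


def map_to_electrodes(frame, sources):
    """Assign the selected ganglion outputs of one frame to the electrodes."""
    return [[frame[x][y] for (x, y) in row] for row in sources]
```

For a 100 × 100 ganglion layer and K = 5, `select_sources` returns a 20 × 20 grid of coordinates, which matches the example above.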
This method is effective because it maps data adaptively onto a limited number of electrodes. Furthermore, by updating the selected ganglion outputs over time, it provides better spatial resolution than standard down-sampling methods.

Generation of local interleaved stimulation pulses
Phosphene profiles elicited by electrical stimulation are highly related to stimulation strength and stimulation duration [33][34][35]. The elicited phosphene size increases with stimulation strength. Similarly, if stimulation frequency increases, brighter phosphene profiles are elicited. The retinal implant systems proposed by several research groups offer highly programmable hardware designs that are capable of setting each parameter, such as the stimulation current level, stimulation frequency, and pulse type (biphasic or monophasic). In the retina, however, color tones are identified by spike latencies (or frequency), not by spike amplitudes. Thus, in this study, it was assumed that all of the stimulation pulses obtained from the artificial retina model had the same strength or amplitude.
In electronic implant systems, electrode interactions decrease the performance of the system. To overcome this problem, interleaved stimulation strategies are widely used in cochlear implant systems [20]. For retinal implant systems, the interleaved stimulation strategy can also be useful for reducing undesired electrical interactions between the electrodes. However, this approach must be redesigned for the 2D high-resolution electrode matrix of retinal implant systems. In classical methods, interleaved pulses are generated by applying phase delays to consecutive electrodes [5,18]. If phase delays are applied to all electrodes in the implant matrix, the number of phase delays equals the number of electrodes, which is not feasible for real-time applications. In our method, a new electrode stimulation strategy based on local interleaved pulse generation is developed to overcome the channel interaction problem; it requires only $P \times P - 1$ phase delays (where $P$ is the side length of the local electrode neighborhood used for the interleaved pulse sequence), which is suitable for real-time applications (Figure 3).
The proposed local interleaved pulse generation method is shown in Figure 3. In this method, for 2 × 2 local electrode neighborhoods, only 3 phase delays (D1-D3) are required for interaction-free stimulation of the electrodes. The method can also generate the phase shifts required for other local group sizes (for 3 × 3 electrode neighborhoods, it generates 8 phase delays). Figure 3a illustrates a subgroup of electrodes ($E_{11}$-$E_{22}$). In the stimulation sequence (Figure 3b), the first electrode in the group ($E_{11}$) is stimulated first and the neighboring electrodes are then stimulated sequentially ($E_{12}$, $E_{21}$, $E_{22}$). This stimulation strategy is applied to all subgroups in the retinal electrode matrix (starting with $E_{13}$, $E_{14}$, $E_{23}$, ...).
Determination of the stimulation order can be generalized to a $P \times P$ sub electrode group using the following notation: $(r,c)$ are the row and column indices of the electrodes in the $M \times N$ electrode matrix, $ord$ is the stimulation order ($ord = 1, 2, \dots, P \times P$), $O_{ij}$ is the stimulation order matrix for the $P \times P$ neighborhood, and $E_{rc}$ is the stimulation order of the electrode at location $(r,c)$.
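The order assignment can be sketched as follows, assuming a row-major order inside each subgroup (which reproduces the sequence E11, E12, E21, E22 described above) and a fixed time slot per electrode. The names `order_matrix`, `stimulation_order`, `phase_delay`, and the parameter `slot_ms` are illustrative assumptions, not the authors' listing.

```python
def order_matrix(p):
    """Stimulation order O_ij for one P x P subgroup: 1 .. P*P, row-major
    (assumed ordering inside the subgroup)."""
    return [[i * p + j + 1 for j in range(p)] for i in range(p)]


def stimulation_order(m, n, p):
    """Tile the order matrix over an M x N electrode matrix.
    Entry [r][c] is the stimulation order E_rc of that electrode."""
    o = order_matrix(p)
    return [[o[r % p][c % p] for c in range(n)] for r in range(m)]


def phase_delay(order, slot_ms):
    """Delay of an electrode relative to the first one in its subgroup.
    slot_ms is a hypothetical slot duration in milliseconds."""
    return (order - 1) * slot_ms


# Example: a 4 x 4 matrix with 2 x 2 subgroups
orders = stimulation_order(4, 4, 2)
distinct_delays = {phase_delay(o, 1.0) for row in orders for o in row} - {0.0}
```

In this sketch every subgroup shares the same P × P − 1 nonzero delays, so the number of delays does not grow with the size of the electrode matrix.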

Simulation Results
In the simulation studies, the test images were processed with our custom artificial retina model [24,25] to obtain the ganglion cell outputs used as input data for the STEMLIS and classical methods. This section is organized into three subsections. The first subsection describes the simulation method for quantitative comparison. The second presents quantitative results in terms of several error measures. The last presents the test results for normally sighted subjects.

Description of simulation method for quantitative comparison
Since the main aim of this method is to provide meaningful image perception when mapping high-resolution data to a low-resolution electrode matrix, the artificial retina output dimension was set equal to the input image dimension. For the spatio-temporal electrode mapping stage of the STEMLIS method, the electrode neighborhood was set to 5 × 5 pixels in the spatial dimension and 5 frames in the temporal dimension. The mapped electrode matrix size was determined by the electrode neighborhood (Table 1). For the local interleaved stimulation stage, a 2 × 2 sub electrode group was selected. Using the artificial retina model, the spike outputs of the ganglion cells were obtained with a time step of 1 ms, and a sequence of 100 frames was stored. The stored spike data were used to measure the performance of the STEMLIS method and the classical method.
To analyze the performance of the method, the obtained stimulation pulses must be simulated as light perceptions, that is, as phosphenes. Electrical stimulation of the retina with microelectrodes elicits phosphenes, and it is known that the intensity of the electric field decreases as a function of distance from the electrode [36,37].
A phosphene profile is generally defined as a 2D Gaussian distribution, as in Eq. 5 [29]. Here, $\sigma_x$ and $\sigma_y$ represent the bandwidth of the Gaussian distribution along the x and y axes, and $\sigma$ is the standard deviation of the distribution. For each phosphene profile, $\sigma_x$ and $\sigma_y$ are set to 5 pixels and $\sigma$ is set to 1.2. These values are sufficient to simulate phosphene profiles for the electrode neighborhood defined in this study. In this way, the phosphene perceptions elicited by electrical stimulation can be simulated.
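A phosphene profile of this kind might be generated as in the sketch below. It assumes that the 5-pixel bandwidth is the side length of the kernel, that σ = 1.2 is the standard deviation in pixels, and that the peak is normalized to 1; Eq. 5 itself may use a different normalization.

```python
import math


def phosphene_profile(size=5, sigma=1.2):
    """Square 2D Gaussian kernel centered on the electrode.
    size and sigma follow the values quoted in the text; interpreting
    size as the kernel side length is an assumption of this sketch."""
    c = (size - 1) / 2  # center coordinate of the kernel
    return [[math.exp(-((x - c) ** 2 + (y - c) ** 2) / (2 * sigma ** 2))
             for y in range(size)]
            for x in range(size)]


profile = phosphene_profile()
```

The kernel peaks at its center and falls off with distance, which mirrors the decay of the electric field away from the electrode.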
The phosphene profiles were created for each stimulated electrode using Eq. 5 for each interval of 1 ms. The calculated phosphene profiles were summed with different weights over the last 5 ms (the last 5 frames) to simulate charge accumulation in the ganglion cells according to Eq. 6.
Here, $P_{ij}$ is the elicited phosphene profile at time $t$, $\tau$ is the temporal window used to calculate charge accumulation over the ganglion cells ($\tau = 5$), and $E_{ij}$ is the temporal stimulation matrix of the electrode, containing 1 or 0 values to indicate whether stimulation is present. $G(x,y)$ is the Gaussian profile defined in Eq. 5 and $w_k$ is a weight array that gives greater weight to the present stimulation frame and less weight to the other frames. In this study, for the weighted sum of Gaussian profiles, we used an exponentially decreasing function ($w_k = e^{-k/\tau}$).
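The weighting described above can be illustrated with a short sketch. It assumes that k = 0 is the current frame and that k runs over the last τ frames; `accumulated_drive` is a hypothetical helper that returns the scalar by which one electrode's Gaussian profile would be scaled.

```python
import math


def accumulation_weights(tau=5):
    """Exponentially decreasing weights w_k = exp(-k / tau), k = 0 .. tau-1."""
    return [math.exp(-k / tau) for k in range(tau)]


def accumulated_drive(stim_history, tau=5):
    """Weighted sum of one electrode's 0/1 stimulation values.
    stim_history is ordered oldest first, so the last entry is the
    current frame and receives the largest weight."""
    weights = accumulation_weights(tau)
    recent = stim_history[-tau:][::-1]  # recent[0] is the current frame
    return sum(w * e for w, e in zip(weights, recent))
```

A pulse in the current frame contributes with weight 1, while a pulse four frames earlier contributes with weight e^(−4/5), so older stimulation fades rather than disappearing abruptly.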
In the evaluation of visual perception quality for visually impaired persons, clinical tests such as object recognition, determination of the direction of moving objects, object counting, and large print reading are performed. Apart from technological limitations, these tests are still insufficient for evaluating the performance of image processing and electrode stimulation strategies. Evaluation methods based on computer simulations alone are likewise insufficient for measuring the quality of these methods. However, the results of computer-based simulations can be analyzed alongside tests conducted with humans. An effective way to analyze the resulting phosphene images is therefore to consider their edges with respect to the original image frame. Since the original image is a higher-dimensional matrix in this study, comparing the resulting images by histogram similarity is more suitable than an MSE-based comparison. Histogram similarity measures how alike the intensity distributions of two images are, and it is calculated as shown in Eq. 7, where $p(i)$ and $q(i)$ are the normalized histogram distributions of the original and reconstructed images, respectively.
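As an illustration of a histogram-based similarity between two images, the sketch below uses histogram intersection, the sum of bin-wise minima of two normalized histograms. This is a stand-in chosen for the example; the measure defined in Eq. 7 may differ. Images are assumed to be lists of rows of integer grey levels in 0..255.

```python
def normalized_histogram(image, bins=256):
    """Normalized grey-level histogram p(i) of an integer-valued image."""
    counts = [0] * bins
    n = 0
    for row in image:
        for value in row:
            counts[value] += 1
            n += 1
    return [c / n for c in counts]


def histogram_similarity(p, q):
    """Histogram intersection (assumed stand-in for Eq. 7):
    1.0 for identical distributions, 0.0 for disjoint ones."""
    return sum(min(a, b) for a, b in zip(p, q))
```

Because both histograms are normalized, the two images do not need to have the same dimensions, which suits a comparison between a full-size original and a smaller reconstruction.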
To compare the STEMLIS method with the classical method, results for classical electrode mapping and the stimulation method were obtained. In the classical approach, the original image (or ganglion activity layer) is down-sampled to the electrode matrix size by using interpolation-based resolution reduction algorithms and each electrode is synchronously stimulated. In this way, phosphene based output images for the classical method were obtained.
The test images used in the quantitative simulation study are shown in Figure 5. Four of them were selected from the Berkeley image database and the other 2 are well-known images [32]. The color images from the Berkeley database were converted into intensity images, as the artificial retina model only accepts intensity images. These images include textures, smooth regions, various contrast levels, varying intensities, strong and weak edges, and text. Using these images, the STEMLIS and classical methods were compared in terms of MSE, HSR, and edge discrimination parameters. For all comparisons, the ganglion activities obtained over a duration of 30 ms were used for the phosphene-based reconstruction of images. Phosphene-based reconstruction of an image using the STEMLIS method is shown in Figure 6, with the reconstructed image given for stimulation frames 5, 8, 11, 14, 18, and 22. As can be seen, the details in the image become clearly perceptible as the number of spike frames increases.
In the MSE-based comparison, MSE values were calculated between the phosphene-based reconstructed images and the original images. Similarly, HSR values were calculated from the similarity of the phosphene images and the original images. The edge-based comparison used the Canny edge detection method: edges of the original and phosphene images were detected with a Canny edge detector, the edge map of each phosphene image was subtracted from the edge map of the original image, and the resulting errors were summed to give the number of incorrectly detected edge pixels (IDEP). Since the electrode matrix dimension is smaller than that of the original image, the original images were resized to the electrode matrix dimension for comparison and the excess image dimensions were cropped. On average, the STEMLIS method yields a 519.72 lower MSE value, a 3% higher HSR value, and a 136.5 lower IDEP value than the classical method. The original image dimensions and electrode matrix dimensions are given in Table 1 (Figures 6-8).
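The MSE and IDEP measures described above can be sketched as follows. The sketch assumes both images have already been brought to the same size and that binary edge maps have been computed beforehand (for example, by a Canny detector, which is not implemented here). Counting the differing edge pixels is one reading of the subtraction-and-sum step.

```python
def mse(image_a, image_b):
    """Mean squared error between two equally sized grey-level images."""
    n = len(image_a) * len(image_a[0])
    total = sum((a - b) ** 2
                for row_a, row_b in zip(image_a, image_b)
                for a, b in zip(row_a, row_b))
    return total / n


def idep(edges_ref, edges_test):
    """Number of incorrectly detected edge pixels: positions where the
    binary edge map of the test image differs from the reference."""
    return sum(1
               for row_r, row_t in zip(edges_ref, edges_test)
               for r, t in zip(row_r, row_t)
               if r != t)
```

Lower values are better for both measures, which is how the averages reported above should be read.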

Tests with normally sighted subjects
The performance of the method was also evaluated in visual perception tests covering contrast discrimination, pattern recognition, text reading, and object counting, using a dedicated dataset. To evaluate contrast discrimination, an image containing 7 vertical bars, each with a different grey level, was used (Figure 9a). To evaluate pattern recognition, an image consisting of 8 differently textured regions was used (Figure 9b). Text reading was evaluated with a test image containing a test word on a noisy background (Figure 9c). Object counting was tested with an image containing 12 objects at several grey levels (Figure 9d). The test images are shown in Figure 9. All images in the dataset are 130×220 pixels.
The test images were processed as described in section 3.1, and output images were obtained for the STEMLIS and classical methods at three electrode resolutions: 26×44, 43×73, and 130×220. Fifteen normally sighted subjects aged between 23 and 35 (26.6 ± 3.45) took part in this experiment. The test images were presented in random method order (sometimes the STEMLIS method first, sometimes the classical method first) on an LED monitor under the same lighting conditions. The distance between subject and monitor was 50 cm. Processed images were presented enlarged to 260×440 pixels. None of the subjects had seen the test images before, and the images were presented in order from the lowest (26×44) to the highest (130×220) resolution. In the contrast discrimination test, for each image resolution, subjects were asked how many contrast levels they could see in the image, and their answers were recorded. In the pattern recognition test, for each resolution, subjects were asked how many differently textured regions the image contained. In the text reading test, subjects were asked to read the test word in the image and to rate the ease of reading. In the object counting test, subjects were asked to indicate the number of objects they could see. The original and reconstructed images for the STEMLIS and classical methods are given in Figure 10. Average percentage scores for the three resolution settings are presented graphically in Figure 11. The dimension of the analysis window is adjusted automatically to the resolution: the 26×44 electrode resolution gives a 5×5 analysis window and the 43×73 electrode resolution gives a 3×3 analysis window for the STEMLIS method. For the 130×220 electrode resolution there is no analysis window for the STEMLIS method (the data selection step is bypassed, K×K = 1×1), so in this case the two methods differ only in the interleaved stimulation strategy.
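The window sizes quoted above follow from the ratio of the image size to the electrode resolution. A small sketch, assuming the ratio is rounded to the nearest integer (the rounding rule is an assumption; the text only states the resulting sizes):

```python
def window_size(image_dim, electrode_dim):
    """Side length K of the analysis window along one dimension.
    Rounding to the nearest integer is an assumed rule."""
    return max(1, round(image_dim / electrode_dim))


# 130 x 220 test images at the three electrode resolutions
sizes = [window_size(130, rows) for rows in (26, 43, 130)]
```

With these inputs `sizes` evaluates to 5, 3, and 1, and the column dimensions (220 against 44, 73, and 220) give the same values.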
In order to establish a relationship between visual and quantitative evaluation, for the test images, the obtained images for STEMLIS and classical method were analyzed in terms of MSE, HSR and IDEP parameters as in section 3.2. The quantitative results for these parameters are given in Table 2.
The graphs in Figure 11 show the averaged test scores of the 15 subjects as percentages. Error bars show the maximum and minimum test scores for each series. The averaged score of each test was converted to a percentage using the maximum number of contrast levels (7), the maximum number of patterns (8), and the maximum number of objects (12) in the test images. In the text reading test (Figure 10d), both methods failed at low resolutions, so no graph was produced for this test. At high resolution, however, subjects were able to read the test word with both methods, and the STEMLIS method was rated better than the standard method in terms of ease of reading. This result is compatible with the quantitative results in Table 2 (Test 3 at 130×220 resolution). The higher HSR and lower IDEP values may contribute to making the text in the image more perceivable.
In the pattern recognition test, the STEMLIS method scored 8.33%, 4.16%, and 14.1% higher than the standard method at the 26×44, 43×73, and 130×220 electrode resolutions, respectively (Figure 11a). In the pattern recognition task, edge-based and pixel-based similarities are somewhat more important than the HSR parameter. In Table 2, "Test 1", a relatively low MSE value was obtained only at the 43×73 resolution and a relatively low IDEP value only at the 130×220 resolution, whereas the HSR value was higher than that of the standard method at all resolutions. Even where the MSE, HSR, and IDEP values do not differ significantly, the proposed STEMLIS method still yields better visual perception performance. Results for this test are shown in Figure 10a. As can be seen from this figure, the differently textured regions are fused in the standard method because of the continuous stimulation effect.
In the contrast discrimination test, the STEMLIS method scored 8.6%, 14.2%, and 10.5% higher than the standard method at the 26×44, 43×73, and 130×220 resolutions, respectively (Figure 11b). In the quantitative results in Table 2, "Test 2", the STEMLIS method performed better on the MSE and HSR parameters at every resolution. Only at the 130×220 electrode resolution did the STEMLIS method obtain a much lower IDEP value than the standard method. Figure 10b shows that, even at low resolutions, the edge detection results for the STEMLIS method are very similar to the original. This result indicates that histogram distribution and pixel-based similarity, rather than edge-based similarity, are the more important factors for contrast perception.
In the object counting test, the STEMLIS method scored 5.55%, 11.1%, and 16.1% higher than the standard method at the 26×44, 43×73, and 130×220 electrode resolutions, respectively (Figure 11c). With increasing electrode resolution, the STEMLIS method produced more IDEP error, but at each resolution its HSR scores are higher than those of the standard method. For the MSE parameter in Table 2, "Test 4", relatively lower MSE values were obtained at all resolutions except the lowest. This result suggests that lower MSE and higher HSR values improve the perception of the objects in the image. It should be noted that the error range of the standard method widened at high resolution.
With increasing resolution, for each test image, the MSE values decrease while the IDEP and HSR values increase; the visual performance of both methods, however, increases approximately linearly. Although it was difficult to determine the exact relation between the MSE, HSR, and IDEP parameters and visual perception performance, these results indicate that object counting, contrast discrimination, text reading, and pattern recognition scores can be high when low MSE and high HSR values are obtained. The overall quantitative performance of the methods on the visual test images is given in Figure 12.

Discussion
In this section the study is discussed in terms of motivation, evaluation method, hardware requirements, and simulation and test results.

Evaluation of simulation results
To evaluate the performance of the STEMLIS method in terms of image quality, obtained image quality is measured based on some statistical parameters widely used in image quality comparison studies. In the quantitative comparison studies proposed method introduced, in average, 519.72 lower MSE value, 3% higher HSR value and 136.5 lower IDEP value than classical method. The images used in quantitative comparison are selected to be highly complex natural images so that the quality of the resulting images can be compared objectively.
However only quantitative results cannot reflect the real performance of the method, it can be gives an idea about the improvement. Then, in addition to quantitative comparison study, in order to test the visual performance of the method for some perception/ recognition tasks, visual experiments with normal seeing subjects were carried out.
To evaluate the contribution of the method in terms of visual perception quality the visual tests with normal seeing people were carried out. In the tests, relatively less complex synthetic test images selected to evaluate the performance of the method in terms of some important daily visual activities which are contrast discrimination, pattern recognition, text reading and object counting. For three resolution settings (26×44, 43×73, 130×220) correspond to (3×3, 5×5 and no window) analysis window dimension, were used. With the proposed method, in average for all test, participants reached 7.5%, 9.9% and 13.6% higher scores at 26×44, 43×73 and 130×220 electrode resolutions respectively. The average performance values for the each test; the best results for contrast discrimination test 14.2%, for object counting test 16.1% and, for pattern recognition test 14.1% higher scores were obtained than classical method. Therefore in both methods, subjects were able to read test word only in 130×220 electrode resolution setting when MSE values become lower than 1300. In addition to this, subjects said that reading is more comfortable in STEMLIS method. The test images were also analyzed by using quantitative comparison for both methods. For some resolution settings, MSE and IDEP values are relatively higher than standard method, however, in average, proposed method yielded relatively good scores ( Table 2, overall values and Figure 12). The fact that obtaining higher error values for some images is highly related to image complexity also (e.g.Test image 3). Although it was difficult to establish a mathematical relation between quantitative results and visual test scores, the results for quantitative and visual experiments are consistent with each other.
In the quantitative tests, the overall performance of the proposed method was higher than that of the classical method. For some complex images containing many edges and textures (e.g. the Figure 7 test image and Figures 10a and 10d), the error values were higher than those of simpler images. In the visual perception tests, especially the pattern recognition test, these error values did not reduce the performance of the method (Figure 11a).
As can be seen from Figures 7 and 10, spatial details and grey tone values can be detected well within the temporal stimulation duration when interleaved stimulation is used.

Effect of interleaved stimulation
The data selection step of our method is bypassed (analysis window dimension K×K = 1×1) in the visual evaluation tests when the electrode resolution and the dimensions of the test images are the same. In this case, the electrodes are mapped one-to-one to the ganglion outputs of the artificial retina model, and the STEMLIS and classical methods differ only in the interleaved stimulation strategy. The results for this setting show that a significant improvement was obtained by interleaved stimulation alone. This result indicates that interleaved stimulation can improve the perceptual image quality of retina implant systems. The interleaved stimulation strategy also allows various ranges of stimulation current to be used to improve the dynamic contrast levels perceived by patients. This property can be important for patients whose retinal electrical threshold levels are outside the standard range because of the degree of damage caused by retinal disease.
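The idea of firing neighboring electrodes in different time slots can be illustrated with a minimal sketch. This is not the exact grouping used in STEMLIS: the function names `phase_index` and `interleave` and the diagonal assignment rule `(row + col) mod 3` are assumptions made for illustration. With three phase slots, this rule guarantees only that horizontally and vertically adjacent electrodes never fire in the same slot.

```python
import numpy as np

def phase_index(rows, cols, n_phases=3):
    """Assign each electrode a phase slot (hypothetical rule, not from the paper).

    (row + col) % n_phases makes every horizontal or vertical neighbor
    land in a different slot than the electrode itself.
    """
    r, c = np.indices((rows, cols))
    return (r + c) % n_phases

def interleave(amplitudes, n_phases=3):
    """Split one frame of stimulation amplitudes into n_phases sub-frames.

    Sub-frame p keeps only the electrodes assigned to slot p; the
    sub-frames would be delivered one after another with a phase delay.
    """
    phase = phase_index(*amplitudes.shape, n_phases=n_phases)
    return [np.where(phase == p, amplitudes, 0) for p in range(n_phases)]

# Usage: a small 3x4 electrode patch with arbitrary amplitudes.
frame = np.arange(1, 13).reshape(3, 4)
for p, sub in enumerate(interleave(frame)):
    print("phase slot", p)
    print(sub)
```

Each electrode appears in exactly one sub-frame, so summing the sub-frames recovers the original frame. The classical (synchronous) case corresponds to delivering the whole frame in a single slot.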

Performance of the method at various electrode resolutions
According to the visual test results, performance increased with electrode resolution for both methods. The quantitative results show that increased electrode resolution leads to high error values for complex images. Nevertheless, high electrode resolution yields high scores in all visual tests. At the two lowest electrode resolution settings (26×44 and 43×73), the test word in the text reading test was not recognized with either method.
Because the MSE measure depends on differences of pixel values in the [0, 255] interval, changes that are unimportant for visual perception (e.g. small changes in lighting conditions) may produce large error values. The HSR and IDEP measures have a relatively small range of variation and may therefore agree better with visual perception.
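The sensitivity of MSE to a perceptually minor change can be checked with a short worked example. The sketch below uses the standard definition of mean squared error; the random test image and the uniform brightness offset of 10 grey levels are assumptions chosen only for illustration and are not data from this study.

```python
import numpy as np

def mse(a, b):
    """Mean squared error between two images with grey levels in 0-255."""
    a = np.asarray(a, dtype=np.float64)
    b = np.asarray(b, dtype=np.float64)
    return float(np.mean((a - b) ** 2))

# Hypothetical 130x220 image with grey levels between 20 and 199.
rng = np.random.default_rng(0)
image = rng.integers(20, 200, size=(130, 220))

# A uniform brightness change of 10 grey levels keeps every edge and
# every contrast step intact, yet each pixel contributes 10**2 to the sum.
brighter = image + 10

print(mse(image, image))     # 0.0
print(mse(image, brighter))  # 100.0
```

A uniform offset of d grey levels always gives an MSE of d², whatever the image content. For scale, an MSE of 1300 corresponds to a root-mean-square error of about 36 grey levels.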

Is the artificial retina model needed for retina implant systems?
Since the implementation of the artificial retina model is important, as in cochlear implant systems, the method in this study was developed primarily for use with an artificial retina model. Accordingly, ganglion activities obtained from our artificial retina model were used. Since the artificial realization of the retina potentially holds an important place in sight restoration studies, the aim here is to develop a method that meets the requirements of future retina implant systems. The CORTIVIS project, which is based on artificial retina-like processing, is a good example of this approach [38].
The tone differences in visual images are encoded by the firing of ganglion cells. For some specific visual tasks, population codes of ganglion cells are responsible for visual perception. If the stimulation signals produced by the image processing method come close to the signals of the original retinal ganglion cells, the performance of the system will naturally improve. The same cannot be said of simpler image processing algorithms (edge detection, static stimulation, etc.) that do not consider the working principles of the retina. In that case, with look-up tables and stimulation strategies, the grey tone intervals in the images can be expressed with only a few levels. Furthermore, even this may not be achieved unless interleaved stimulation is used. By using an artificial retina model, perceived contrast levels can be increased. Figures 10b and 11b show that contrast discrimination scores improved with interleaved stimulation.
Another important factor for obtaining high performance from retinal prosthesis systems is the location of the electrodes. The macular region is generally a suitable place for electrode placement, and the ganglion cells in this region have similar properties. Considering this structure, the artificial retina model should first be adapted to generate stimulation signals for the cell types in this region. In this study, the outputs of sustained-type ganglion cells were used to obtain the simulation results.
The simulation studies in this paper are based on the assumption that electrical stimulation causes a phosphene-like light perception in the retina. In fact, the retinal circuitry may not respond synchronously with the electrical stimulation. Furthermore, a single electrical stimulation pulse may even cause a series of firings in a ganglion cell. However, although the same situation exists for spiral ganglion cells in the cochlea, cochlear prosthesis users reach high speech recognition performance with these systems. Although the ganglion cell density in the cochlea is much lower than that in the retina, the interleaved stimulation approach was first proposed for cochlear prostheses, where it considerably improved performance.
The quality of the obtained phosphene image also depends on the artificial retina model, since the firing activities are directly related to the spatial content of the images. Retina modeling is still an open research subject, since our knowledge of how the retina works is still fairly limited. In this study, however, the same firing activities obtained from our retina model were used to compare the performance of the STEMLIS method and the classical method. The graphical results in Figure 11 show that the proposed STEMLIS method provides an image perception more similar to the original image. Furthermore, according to the MSE, HSR, and IDEP parameters, the proposed method provides improved image quality compared with the classical method.

Determining real performance of the method
Visual perception tests come closer than quantitative results to showing the real performance of the method. In the tests with normally sighted people, good discrimination and recognition scores were obtained for the three resolution settings. However, it should be noted that the real performance of the method can be evaluated only through clinical experiments with real implant recipients. The simulation results presented in this study serve as a measure for theoretical performance comparison. In clinical experiments with real implant subjects, as in cochlear implant systems, many parameters can affect the real performance of the method, such as age, degree of retinal damage, education and implantation time. Carrying out such clinical experiments therefore requires demanding studies over a long time period. In this study, to make efficient use of evaluation time, the performance of the method was presented according to objective image quality criteria (MSE, HSR and IDEP) in addition to subjective visual perception tests with normally sighted participants.

Hardware requirement for real-time implementation
The proposed method is capable of stimulating the microelectrode matrix with both synchronous and interleaved stimulation at the desired phase shifts. Increasing the phase shifts means that the visual stimulation pulses are sent with a time delay proportional to these phase shifts. The spatiotemporal electrode mapping also requires an extra time delay, since it uses temporal processing to select the most active ganglion output from the artificial retina model. In this study, 5 temporal frames for electrode mapping and 3 phase shifts for interleaved stimulation of the electrodes were used. Even though this approach may seem time-consuming at first, similar algorithms for electrode stimulation and data selection have long been used successfully in state-of-the-art cochlear implant systems. Although the MPEAK, CIS, and ACE methods use broadly similar approaches to electrode stimulation, they do not impose processing burdens large enough to delay speech perception. Similarly, it is thought that sending the stimulation data with a time delay of a few milliseconds would not cause significant delays in visual perception and the understanding of visual scenes. In addition, some silicon artificial retina chip development studies can be found in the literature. Moreover, for low electrode resolutions (up to 64×64), the proposed method can run in real time. In particular, by using parallel hardware such as an FPGA instead of conventional DSP systems, the method can run very fast and efficiently. In addition, with developments in microprocessor and Application Specific Integrated Circuit (ASIC) technologies, chips with higher performance than today's will be developed for retina implant systems.
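The data selection step described above can be sketched as a block-wise maximum over a K×K spatial window and a stack of temporal frames. This is a simplified illustration, not the full STEMLIS selection rule: the function name `map_to_electrodes`, the use of a plain maximum as the "most active" criterion, and the cropping of leftover border pixels are all assumptions.

```python
import numpy as np

def map_to_electrodes(activity, k):
    """Map ganglion outputs to a lower-resolution electrode matrix.

    activity: array of shape (T, H, W) holding T temporal frames of
              ganglion outputs (hypothetical input layout).
    k:        side length of the K x K analysis window.
    Returns an (H // k, W // k) array holding, for each electrode, the
    largest activity found in its window across all T frames.
    """
    t, h, w = activity.shape
    rows, cols = h // k, w // k
    # Drop border pixels that do not fill a complete window.
    cropped = activity[:, :rows * k, :cols * k]
    # Axes after reshape: (frame, electrode row, row in window,
    #                      electrode column, column in window).
    blocks = cropped.reshape(t, rows, k, cols, k)
    return blocks.max(axis=(0, 2, 4))

# Usage: 5 temporal frames of 130x220 ganglion outputs.
rng = np.random.default_rng(1)
frames = rng.random((5, 130, 220))
for k in (1, 3, 5):
    print(k, map_to_electrodes(frames, k).shape)
```

With a 130×220 input, window sizes of 3 and 5 give 43×73 and 26×44 electrode matrices, and a window size of 1 leaves the spatial resolution unchanged. The extra latency of this step comes from waiting for all T frames before a selection can be made.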

Conclusion
In this paper, the STEMLIS method, which is based on spatiotemporal electrode mapping and local interleaved stimulation, was proposed for retinal implant systems to improve both the spatial and temporal image perception quality. This method acts as an interface for mapping the stimulation data from high resolution ganglion outputs to the low resolution implant electrodes and uses phase shifts to generate interleaved pulses for interaction-free electrode stimulation for neighboring electrodes.
Based on the quantitative and visual test studies, it is concluded that the STEMLIS method is useful for stimulating the electrode matrix and can contribute to the improvement of visual prosthesis systems, especially retina implant systems. Using this algorithm in state-of-the-art retinal implant systems can improve the perception quality of implant users.