Multi-line Adaptive Perimetry (MAP): A New Procedure for Quantifying Visual Field Integrity for Rapid Assessment of Macular Diseases

Purpose: To monitor visual defects associated with macular degeneration (MD), we present a new psychophysical assessment called multi-line adaptive perimetry (MAP) that measures visual field integrity by simultaneously estimating regions associated with perceptual distortions (metamorphopsia) and visual sensitivity loss (scotoma).
Methods: We first ran simulations of MAP with a computerized model of a human observer to determine optimal test design characteristics. In experiment 1, predictions of the model were assessed by simulating metamorphopsia with an eye-tracking device in 20 participants with healthy vision. In experiment 2, eight patients (16 eyes) with macular disease completed two MAP assessments separated by about 12 weeks, while a subset (10 eyes) also completed repeated Macular Integrity Assessment (MAIA) microperimetry and Amsler grid exams.
Results: MAP showed strong repeatability and high accuracy, sensitivity, and specificity (0.89, 0.81, and 0.90, respectively) in classifying patient eyes with severe visual impairment. We also found a significant relationship between the spatial patterns of performance across visual field loci derived from MAP and from MAIA microperimetry. However, there was a lack of correspondence between MAP and subjective Amsler grid reports in isolating perceptually distorted regions.
Conclusions: These results highlight the validity and efficacy of MAP in producing quantitative maps of visual field disturbances, including simultaneous mapping of metamorphopsia and sensitivity impairment.
Translational Relevance: Future work will be needed to assess the applicability of this examination for potential early detection of MD symptoms and/or portable assessment on a home device or computer.

or other characteristics of the observer cohort such as reduced visual field sensitivity. These characteristics were embodied as parameters in the observer model, which we varied to explore a range of individual and population level characteristics in terms of disease status and performance ability. For instance, to simulate a healthier observer with a relatively intact CPM system and mild visual field deficits (e.g. more similar to characteristics of early-stage MD), we would specify a retinal grid with a minor deformation (distortion area = 1 deg diameter), low IDS (0.25 or 25% likelihood of perceiving distortion when the affected area is sampled), a small range of errors due to CPM factors (low sigma of error distribution), and a reasonable level of overall accuracy (90% hit rate). Alternatively, to simulate a less healthy observer with severe limitations to CPM and a larger visual disturbance (e.g. more similar to characteristics of late-stage MD), we would specify a major retinal deformation (distortion area = 3 deg diameter) with high IDS (0.75), high sigma for the CPM error distribution, and relatively lower accuracy (70% hit rate). Thurman et al., 2018 Supplemental File 1
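The two example observers described above can be written down as parameter sets. The following is a minimal sketch only: the class name, field names, and helper method are hypothetical, and the numeric CPM sigma values are placeholders borrowed from the low and high severity cohorts reported under Spatial Accuracy, since the passage above specifies them only as "low" and "high".

```python
import math
from dataclasses import dataclass


@dataclass(frozen=True)
class ObserverParams:
    """Hypothetical container for the observer-model parameters named in the text."""
    distortion_diameter_deg: float  # diameter of the circular distorted region (deg)
    ids: float                      # probability of perceiving distortion when the region is sampled
    cpm_sigma_deg: float            # SD of cognitive-perceptual-motor error (deg); placeholder value
    hit_rate: float                 # overall accuracy in detecting targets

    def distortion_area_deg2(self) -> float:
        """Area of the circular distortion region in square degrees."""
        return math.pi * (self.distortion_diameter_deg / 2.0) ** 2


# Healthier observer (closer to early-stage MD), values from the text except cpm_sigma_deg.
MILD = ObserverParams(distortion_diameter_deg=1.0, ids=0.25, cpm_sigma_deg=1.0, hit_rate=0.90)

# Less healthy observer (closer to late-stage MD), values from the text except cpm_sigma_deg.
SEVERE = ObserverParams(distortion_diameter_deg=3.0, ids=0.75, cpm_sigma_deg=2.0, hit_rate=0.70)
```

Because the diameter triples from the mild to the severe observer, the area of the simulated distortion grows by a factor of nine.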

Test Design Optimization
To derive an optimal test design according to simulated observer data, we manipulated two primary test features: (i) the number of lines presented on each trial and (ii) the method of spatial sampling, that is, whether line locations were drawn at random from a uniform distribution on each trial or according to an adaptive algorithm that takes prior responses into account (Figure 1b). Test design optimization was achieved by simulating thousands of trials with various observer cohorts (each defined by a particular set of observer model parameters) and identifying the test features that maximized both test efficiency and spatial localization accuracy. Test efficiency was operationalized as the average number of trials needed to reach criterion-level confidence in detecting metamorphopsia, while accuracy was measured as the Euclidean distance between the peak of the estimated location of metamorphopsia and the ground-truth location in the model (i.e., the center of the circular region; see Figure 1a). The overall simulation model contains several parameters that define observer characteristics, as well as features that represent test design characteristics; as a result, it would be impractical to search the entire parameter space. Instead, we identified three practical observer cohorts (Table 1) that spanned a reasonable range of model parameters. We also examined modifications of the test design by systematically incorporating different sets of test features. We specified a total of 8 unique test designs, with 1 to 4 lines presented per trial and either adaptive or non-adaptive (random, uniform) spatial sampling (Table 1).
The most basic case of this psychometric test resembles the PHP test in which a single line is presented each trial, a single response is allowed by the observer, and line locations are chosen randomly, or non-adaptively, from trial to trial (Loewenstein et al., 2003).
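The design space described above is small enough to enumerate explicitly. The sketch below is illustrative only; the variable names and string labels are assumptions, not the identifiers used in the original simulations.

```python
from itertools import product

LINES_PER_TRIAL = (1, 2, 3, 4)            # number of lines presented on each trial
SAMPLING = ("non-adaptive", "adaptive")   # spatial sampling scheme
COHORTS = ("low", "mid", "high")          # observer cohort severity labels (assumed names)

# Every combination of line count and sampling scheme is one test design.
test_designs = list(product(LINES_PER_TRIAL, SAMPLING))

# Crossing designs with observer cohorts gives the full set of simulated conditions.
conditions = list(product(COHORTS, test_designs))

# The PHP-like baseline: one line per trial, locations chosen non-adaptively.
baseline_design = (1, "non-adaptive")
```

Here `test_designs` holds the 8 designs and `conditions` holds the 24 cohort-by-design combinations that are simulated.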

The sampling locations of lines on each trial were determined according to one of two schemes: 1) in the non-adaptive case, locations were chosen randomly and independently from a uniform distribution over all possible locations, and 2) in the adaptive case, the probability of sampling a line location depended on prior responses, namely the relative frequency of previous FAs across space. The adaptive algorithm was initialized on trial 1 by sampling locations uniformly. As the test progressed, if a line was presented and there was no FA on it (i.e., a correct rejection), this was taken as marginal evidence that there was no metamorphopsia along that line, and the probability of sampling that line in the future decreased according to parameter w1. On the other hand, if a line sample did result in a FA, this was taken as evidence that there might be metamorphopsia in a region crossed by that line, and the probability of sampling that line in the future increased according to parameter w2. In these simulations, w1 = -0.02 and w2 = 0.1, so the influence of a FA on line sampling probability was five times greater in magnitude than that of a correct rejection. Following each trial and the application of w1 and w2, the sampling distribution was normalized to sum to one across all possible line locations, yielding a probability distribution for sampling lines on subsequent trials. The main idea behind the adaptive algorithm was to increase the likelihood of sampling regions of the visual field that already had evidence for possible metamorphopsia. Sampling suspected regions more frequently is advantageous because it increases the likelihood of a FA or a missed target, so evidence accumulates faster if there is truly an underlying visual disturbance in that region.
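The adaptive update can be sketched in a few lines. The text gives the values of w1 and w2 and the normalization step, but not the exact form of the update, so the following makes several assumptions that are not taken from the source: weights start at 1 for every location, w1 and w2 are added to the unnormalized weight of each presented line, weights are floored at a small positive value, and lines within a trial are drawn without replacement. All function names are hypothetical.

```python
import random

W1 = -0.02  # change after a correct rejection (value from the text)
W2 = 0.10   # change after a false alarm (value from the text)


def update_weights(weights, presented, false_alarms, floor=1e-6):
    """Return new unnormalized weights after one trial (additive update is an assumption)."""
    updated = list(weights)
    for idx, had_fa in zip(presented, false_alarms):
        # Raise the weight after a false alarm, lower it after a correct rejection.
        updated[idx] = max(floor, updated[idx] + (W2 if had_fa else W1))
    return updated


def normalize(weights):
    """Scale weights so they sum to one, giving a sampling distribution."""
    total = sum(weights)
    return [w / total for w in weights]


def sample_lines(weights, n_lines, rng):
    """Draw n_lines distinct line locations in proportion to the current weights."""
    probs = normalize(weights)
    candidates = list(range(len(probs)))
    chosen = []
    for _ in range(n_lines):
        pick = rng.choices(candidates, weights=[probs[i] for i in candidates], k=1)[0]
        chosen.append(pick)
        candidates.remove(pick)  # no repeated location within a trial (assumption)
    return chosen


# Example: 10 candidate locations, uniform on trial 1.
weights = [1.0] * 10
# Lines 2 and 5 were shown; a false alarm occurred on line 2 only.
weights = update_weights(weights, presented=[2, 5], false_alarms=[True, False])
probs = normalize(weights)
next_lines = sample_lines(weights, n_lines=3, rng=random.Random(0))
```

After this single trial, location 2 is more likely to be sampled than an untested location, and location 5 is slightly less likely, which is the qualitative behavior the paragraph describes.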

Simulation Procedure
The observer model is non-deterministic, so we performed 500 independent simulations to obtain a snapshot of average results for each condition (24 total conditions, comprising 3 observer cohorts and 8 different test designs). Each simulation ran trial-by-trial in a manner similar to an actual human observer (as described in Figure  that exceed the threshold criterion are by definition greater than the maximum value of 95% of smoothed maps derived from 10 locations chosen at random. The observer simulation was terminated when the maximum number of trials (240) was reached, or when both of the following criteria were met: 1) at least 8 total FA events had occurred (to make sure enough evidence had accumulated), and 2) the thresholded statistical map of visual field integrity had a total area equal to or greater than the area of the simulated distortion region. Following termination, we recorded the number of trials needed to reach criteria (as a measure of efficiency; Figure 1c) and measured the Euclidean distance between the center of the "ground truth" region of the simulated observer and the peak location in the statistical map of visual field integrity (Figure 1d).
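The stopping rule and the two outcome measures described above can be expressed as small helper functions. This is a sketch under stated assumptions: the function names are hypothetical, the distortion region is taken to be a circle specified by its diameter, and the thresholded map area is assumed to be supplied in square degrees by a separate analysis step that is not shown.

```python
import math

MAX_TRIALS = 240       # maximum number of trials per simulation (from the text)
MIN_FALSE_ALARMS = 8   # minimum number of FA events required to stop early (from the text)


def distortion_area(diameter_deg):
    """Area of a circular distortion region with the given diameter (deg^2)."""
    return math.pi * (diameter_deg / 2.0) ** 2


def should_terminate(n_trials, n_false_alarms, thresholded_area, diameter_deg):
    """Stop at the trial cap, or once both early-stopping criteria are met."""
    if n_trials >= MAX_TRIALS:
        return True
    enough_evidence = n_false_alarms >= MIN_FALSE_ALARMS
    enough_area = thresholded_area >= distortion_area(diameter_deg)
    return enough_evidence and enough_area


def localization_error(peak_xy, truth_xy):
    """Euclidean distance (deg) between the map peak and the ground-truth center."""
    return math.dist(peak_xy, truth_xy)
```

The number of trials completed when `should_terminate` first returns true serves as the efficiency measure, and `localization_error` gives the spatial accuracy measure.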

Test Efficiency
Mean results of the observer model simulations in terms of test efficiency are shown in Figure 2a, with each of the three panels corresponding to a particular observer cohort (low, mid, or high severity). Two significant trends emerge: 1) test efficiency increased as the number of lines per trial increased, as evidenced by the trend for fewer trials needed on average to reach statistical criteria, and 2) adaptive sampling further improved test efficiency relative to non-adaptive sampling, as reflected by the green markers being consistently lower than the red markers across all conditions. There are also interesting differences among the observer cohorts in terms of test efficiency: fewer trials were needed on average to reach criteria for the mid-stage cohort than for the early-stage and late-stage cohorts. This is likely due to an interaction of the IDS parameter (likelihood of producing a false alarm) with the CPM parameter (the standard deviation of spatial localization errors). For adaptive sampling, we found that 49-57% fewer trials were needed to reach criterion-level performance in identifying the underlying metamorphopsia across the three observer cohorts, compared with the non-adaptive case. In general, better test efficiency translates to quicker assessment of the underlying disease state (or lack thereof), or to a more reliable estimate in the same amount of time as a less efficient test. We surmise from these results that using 3 lines/trial could produce a near-optimal balance between test efficiency and overall cognitive burden (e.g., the amount of visual attention and memory needed to encode multiple stimuli).

Spatial Accuracy
Mean simulation results for spatial accuracy are shown in Figure 2b. In contrast to test efficiency, we found that the total number of lines/trial was not a factor in determining localization accuracy. Rather, spatial accuracy was influenced predominantly by the degree of cognitive-perceptual-motor errors represented by the CPM parameter of the model. Simulated observers in the low, mid, and high severity cohorts had CPM error sigmas of 1, 1.5, and 2 deg, respectively, corresponding to increases in mean spatial error of 0.49, 1.11, and 1.55 deg. The accuracy of the MAP test in localizing metamorphopsia via analysis of FA clustering is therefore constrained mainly by individual trait-based features reflecting perceptual, cognitive and motor error in executing the task. This finding highlights the importance of effective task training and explanation prior to initializing the actual test, and making sure that participants can make responses to the screen comfortably and reliably to limit errors due to CPM factors.
Together, these simulation results reveal a substantial increase in time efficiency for MAP designs that employ multiple line stimuli per trial, thereby allowing more behavioral responses per trial, and for designs that employ adaptive sampling to home in on regions suspected of metamorphopsia. Because there is a diminishing marginal benefit to including more than 3 lines, coupled with the cognitive difficulty of encoding and responding to multiple brief stimuli, we hypothesize that an ideal MAP design would have 3 lines per trial and adaptive sampling. We examine this hypothesis directly in Experiment 1 of the manuscript.