Intergrader Agreement of Foveal Cone Topography Measured Using Adaptive Optics Scanning Light Ophthalmoscopy

The foveal cone mosaic can be directly visualized using adaptive optics scanning light ophthalmoscopy (AOSLO). Previous studies in individuals with normal vision report wide variability in the topography of the foveal cone mosaic, especially the value of peak cone density (PCD). While these studies often involve a human grader, there have been no studies examining intergrader reproducibility of foveal cone mosaic metrics. Here we re-analyzed published AOSLO foveal cone images from 44 individuals to assess the relationship between the cone density centroid (CDC) location and the location of PCD. Across 5 graders with variable experience, we found a measurement error of 11.7% in PCD estimates and higher intergrader reproducibility of CDC location compared to PCD location (p < 0.0001). These estimates of measurement error can be used in future studies of the foveal cone mosaic, and our results support use of the CDC location as a more reproducible anchor for cross-modality analyses.


Introduction
Adaptive optics scanning light ophthalmoscopy (AOSLO) allows routine, non-invasive visualization of the photoreceptor mosaic [1]. The clinical utility of AOSLO imaging is driven by the ability to extract quantitative metrics of the mosaic such as cell density, cell spacing and packing regularity [2,3]. Measurement error can be introduced by differing conditions of image acquisition, processing and analysis. Liu et al. [4] showed that 2.5-6.9% of variation in parafoveal cone density measures was due to inter-device differences. The sampling strategy used to extract cone metrics has the potential to contribute significantly to measurement errors, as shown by studies examining different sizes, shapes and orientations of the window used to assess and calculate quantitative cone metrics [5,6]. Measurement errors are modality dependent, with lower errors outside the fovea on split detection images in normal individuals [7] and diseased eyes [8]. Despite the established utility and the rise in number and specificity of automated algorithms [9-11], manual correction of cone coordinates output by algorithms is still routinely practiced. Intergrader differences in this manual step are known to contribute to measurement error of quantitative metrics of the photoreceptor mosaic [4,6-8,12-14], but our understanding of these errors to date has been based on studies of parafoveal retinal locations.
Recent advances in AOSLO have facilitated resolution of the most densely packed foveal cones in normal individuals [15-19]. Some features of the photoreceptor mosaic at the location of peak density increase grader disagreement on cone identification. These features include: 1) interference between tightly packed cones, making them difficult to distinguish from one another; and 2) non-uniform reflectance profiles [20], as demonstrated in Fig. 1. Given these materially different features of the cone mosaic within the fovea, we sought to examine intergrader agreement and measurement error of peak cone density (PCD) estimates. The PCD location is often used to designate the center of the fovea and is therefore used to calculate retinal eccentricities, meaning errors in its location could lead to variation in the localization of more peripheral regions of interest. Therefore, we also examined intergrader variation in the PCD location within the foveal region and compared it to intergrader variation in the location of the weighted centroid of the area containing the highest 20% of cone density values in the foveal region, a landmark described by Reiniger et al. [19]. We set out to do this using the data set from a prior study that examined the interocular symmetry of the foveal cone mosaic [18].

Fig. 1. Intergrader differences in cone identification. 20 × 20 µm foveal regions of interest (ROIs) from one image (that of JC_11591) demonstrating intergrader differences in cone identification. Good agreement can be seen where cells are brightly reflective and clearly circumscribed. In areas with low reflectivity there is not always agreement on the presence of a cone, and disagreement on the location of cones can be seen in larger regions of more diffuse reflectivity, where the profile may represent one cell or two.

Data
This study followed the tenets of the Declaration of Helsinki and was approved by the Medical College of Wisconsin Institutional Review Board (PRO30741). Foveal images of the right eye of 44 individuals imaged as part of a previous study by Cava et al. [18] were analyzed. The confocal AOSLO acquisition protocol used by Cava et al. was specifically modified to facilitate visualization of the foveal cones. Modifications included use of a 680 nm light source for imaging and a sub-Airy disc pinhole (0.5-0.7 Airy disc diameter). Individuals were aged 12-61 years (16 males, 28 females). For each individual, a 300 × 300 µm region of interest (ROI), subjectively inclusive of the area of peak density, was cropped from the foveal montage for counting; these ROIs form the data set analyzed in the current study.

Cone density analysis
Within these ROIs, cones were identified using a semi-automated algorithm (Mosaic Analytics, Translational Imaging Innovations, Hickory, NC) with manual correction by each of five graders (NW, 31 yrs; HH, 27 yrs; JAC, 30 yrs; JC, 45 yrs; and AS, 24 yrs). All graders had visual acuity with correction of 20/20 or better. Before undertaking cone counting for this study, all graders completed a standardized training protocol in which they were required to reach the threshold of <6% intra-grader reliability when grading the 90 confocal images from within 1 degree of the fovea described by Liu et al. [4]. Cone coordinate files of the present study were then used to generate cone density maps using a sum approach with a square window that increases in size fractionally until it includes 150 cells with Voronoi domains whose vertices are fully contained within the 300 × 300 µm ROI (i.e., non-edge cells). A pixel-by-pixel estimate of density was generated, based on the average density of the windows overlapping each pixel, using a modified version of the script found here (https://github.com/Eurybiadan/Metricks). Two landmarks within the topographical map of cone density were identified within these density matrices: 1) the PCD location and 2) the CDC location, which is described by Reiniger et al. [19] as the weighted centroid of all density values within the 80th percentile isodensity contour (using the MATLAB function regionprops with the 'WeightedCentroid' argument). An example of the density maps produced from each grader, with overlaid isodensity contours and PCD and CDC locations, is shown in Fig. 2.
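As a minimal illustration of the two landmarks described above, the following Python sketch locates the PCD and a CDC-like point on a small density map. It is not the authors' MATLAB pipeline: the function names are hypothetical, the 80th percentile threshold is assumed to be taken over all pixel values in the map, and the connectivity of the thresholded region (which regionprops would handle) is ignored.

```python
def percentile(values, q):
    """Linearly interpolated percentile (q in 0-100) of a list of numbers."""
    ordered = sorted(values)
    if len(ordered) == 1:
        return ordered[0]
    pos = (len(ordered) - 1) * q / 100.0
    lo = int(pos)
    hi = min(lo + 1, len(ordered) - 1)
    return ordered[lo] + (ordered[hi] - ordered[lo]) * (pos - lo)


def pcd_location(density):
    """Return (row, col, value) of the highest-density pixel in a 2D map."""
    best = None
    for r, row in enumerate(density):
        for c, value in enumerate(row):
            if best is None or value > best[2]:
                best = (r, c, value)
    return best


def cdc_location(density, q=80.0):
    """Density-weighted centroid (row, col) of pixels at or above the q-th percentile.

    Assumption: the threshold is the q-th percentile of all pixel values,
    and every pixel above it is used regardless of connectivity.
    """
    flat = [v for row in density for v in row]
    threshold = percentile(flat, q)
    total = sum_r = sum_c = 0.0
    for r, row in enumerate(density):
        for c, value in enumerate(row):
            if value >= threshold:
                total += value
                sum_r += r * value
                sum_c += c * value
    return sum_r / total, sum_c / total


if __name__ == "__main__":
    # Toy map (arbitrary units) with an off-center peak.
    toy = [
        [1, 1, 1, 1, 1],
        [1, 5, 5, 5, 1],
        [1, 5, 5, 9, 1],
        [1, 5, 5, 5, 1],
        [1, 1, 1, 1, 1],
    ]
    print("PCD (row, col, value):", pcd_location(toy))
    print("CDC (row, col):", cdc_location(toy))
```

Both functions return pixel indices; multiplying by the image scale (µm per pixel) would give positions in µm. In the toy map the PCD sits on the single brightest pixel, whereas the CDC is pulled only slightly toward it, which illustrates why a centroid is less sensitive to one grader adding or removing a cone near the peak.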

Statistics
The intergrader repeatability and measurement error of PCD estimates were assessed as outlined in [21]. Briefly, the within-image standard deviation was calculated and then squared to give the variance for each image. The square root of the average variance across all 44 images gives S_w, which was multiplied by 2.77 to estimate the repeatability, while the measurement error is defined as S_w times 1.96. The intergrader intraclass correlation coefficient (ICC) was calculated (ICCest, ICC 2.3; R Version 1.4.3). A variance components model was used to explore the contribution of image and grader to overall variability in the PCD estimate. A linear regression model with random effects only was used to estimate the variance components, and resampling with 1000 repetitions generated 95% confidence intervals. A one-way ANOVA was performed to assess the effect of grader on PCD estimate (GraphPad Prism Version 9.0.0). The intergrader variability in PCD and CDC locations was assessed by calculating the confidence ellipse for the 2D data points generated by the five graders using a customized MATLAB script (2019b, Mathworks, Natick, MA). The ellipse defines the region expected with 95% probability to contain the true population mean of PCD or CDC for all graders; one example can be seen in Fig. 2, panel F. The areas of these confidence ellipses were calculated for comparison. The distances between the average locations across graders for each landmark were computed and the average spatial offsets within each retina assessed.
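The repeatability calculation described above can be sketched in a few lines of Python. This is an illustrative reading of the text rather than the authors' code: the function names are hypothetical, and the within-image standard deviation is assumed to be the sample standard deviation (n - 1 denominator) across graders.

```python
import math
import statistics


def within_subject_sd(estimates_per_image):
    """S_w: square root of the mean within-image variance.

    `estimates_per_image` is a list with one entry per image, each entry
    being the list of PCD estimates from the graders for that image.
    Assumption: sample variance (n - 1 denominator) is used per image.
    """
    variances = [statistics.variance(estimates) for estimates in estimates_per_image]
    return math.sqrt(statistics.mean(variances))


def repeatability_and_error(estimates_per_image):
    """Return S_w, repeatability (2.77 * S_w) and measurement error (1.96 * S_w)."""
    s_w = within_subject_sd(estimates_per_image)
    return {
        "s_w": s_w,
        "repeatability": 2.77 * s_w,
        "measurement_error": 1.96 * s_w,
    }


if __name__ == "__main__":
    # Made-up values for three images and three graders (not study data).
    toy = [[10, 12, 14], [20, 20, 20], [30, 34, 38]]
    print(repeatability_and_error(toy))
```

The factor 2.77 is approximately 1.96 × √2: it bounds the difference between two measurements of the same image, whereas 1.96 bounds the difference between a single measurement and the true value.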

Results
When generating the PCD locations there were no instances of more than one pixel within an ROI having the same maximum or peak density value. The average PCD across all ROIs and all graders was 167,311 cones/mm² (range = 117,626-220,011 cones/mm²). The intergrader ICC of the PCD values was 0.818 (95% CI = 0.744-0.891). The intergrader repeatability for PCD values was 27,784 cones/mm² or 16.61% (95% CI = 26,737-28,832 cones/mm² or 15.98%-17.23%). This means the difference between two different measurements for the same image would be expected to be less than 16.61% for 95% of pairs of measurements. The measurement error was 19,660 cones/mm², meaning that the difference between the measured density and the true value will be less than 19,660 cones/mm² or 11.75% for 95% of observations. Using the variance components model, the largest contribution to the overall variance was from the image (82.2%, 95% CI = 77.9%-88.6%). The grader made a significant contribution to the overall variance (9.7%, 95% CI = 6.4%-14.3%), with the remaining variance being attributable to unspecified error sources (8.1%, 95% CI = 4.5%-8.7%). A one-way ANOVA revealed a statistically significant difference in PCD between all but one pair of graders (F(df_n = 2.292, df_d = 98.57) = 53.62, p < 0.0001). The cone density at the CDC location was lower than the PCD by only 1.13% on average, with a range of 0.076-6.57%. The average and range of graders' estimates of cone density at PCD and CDC for all 44 images are shown in Fig. 3. Individual confidence ellipses for PCD and CDC for each image are shown in Fig. 4.
The average confidence ellipse area for PCD location estimates was 1,231 µm² with a range of 6-18,804 µm². The corresponding figures for CDC location estimates were 80 µm² and 1-1,712 µm², respectively. The CDC location was significantly more reproducible than the PCD location (p < 0.0001, Wilcoxon matched-pairs signed rank test). The variability of PCD or CDC location was not statistically related to the density within the retina. Linear regression did not show a relationship between the PCD location confidence ellipse area and the average PCD estimate (r² = 0.012, p > 0.05). This was also true when comparing the CDC location confidence ellipse area and the average PCD estimate (r² = 0.010, p > 0.05). The average distance from the average PCD to the average CDC was 7.28 µm with a range of 0.89-19.92 µm. This offset was not related to average PCD estimate (r² < 0.0001, p > 0.05) or axial length (r² = 0.0523, p = 0.1355). This offset was displaced on average 0.29 µm nasally and 0.81 µm superiorly. The range of offsets by grader and the averages across five graders can be seen in Fig. 5. There was a weak relationship between the average area bounded by the 80th percentile isodensity contour and PCD confidence ellipse area, meaning the steeper the slope of the cone density values moving away from the fovea, the smaller the disagreement between graders on the PCD location (r² = 0.4870, p < 0.0001). In contrast, there was no relationship between the average area bounded by the 80th percentile isodensity contour and the CDC confidence ellipse area (r² = 0.0114, p > 0.05).
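For readers who want to reproduce ellipse areas of the kind reported above, the sketch below computes the area of a 95% confidence ellipse for the mean of a handful of 2D points. The authors' customized MATLAB script is not described in enough detail to confirm its construction, so this uses one textbook option as an assumption: the Hotelling T² confidence region for a bivariate mean, for which the F quantile with 2 numerator degrees of freedom has a closed form. The function name is hypothetical.

```python
import math


def confidence_ellipse_area(points, alpha=0.05):
    """Area of the (1 - alpha) confidence ellipse for the mean of 2D points.

    Assumption: Hotelling T^2 region for a bivariate mean. With p = 2 the
    scale factor p(n-1)/(n(n-p)) * F(p, n-p; 1-alpha) simplifies to
    ((n - 1) / n) * (alpha ** (-2 / (n - 2)) - 1).
    """
    n = len(points)
    if n < 3:
        raise ValueError("at least three points are required")
    mean_x = sum(p[0] for p in points) / n
    mean_y = sum(p[1] for p in points) / n
    # Sample covariance matrix entries (n - 1 denominator).
    sxx = sum((p[0] - mean_x) ** 2 for p in points) / (n - 1)
    syy = sum((p[1] - mean_y) ** 2 for p in points) / (n - 1)
    sxy = sum((p[0] - mean_x) * (p[1] - mean_y) for p in points) / (n - 1)
    det = max(sxx * syy - sxy * sxy, 0.0)
    scale = (n - 1) / n * (alpha ** (-2.0 / (n - 2)) - 1.0)
    return math.pi * scale * math.sqrt(det)


if __name__ == "__main__":
    # Made-up landmark locations (µm) from five graders, not study data.
    toy = [(0.0, 0.0), (2.0, 0.0), (0.0, 2.0), (2.0, 2.0), (1.0, 1.0)]
    print("ellipse area (µm^2):", confidence_ellipse_area(toy))
```

Because the area scales with the square root of the determinant of the covariance matrix, a single outlying grader inflates it sharply, which is consistent with the wide range of PCD ellipse areas. A different construction (for example a chi-square based ellipse) would change the absolute areas but not the ranking of images.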

Discussion
This work demonstrates intergrader variability in PCD value estimates and shows that estimates of PCD location are associated with greater intergrader variability than those of the CDC landmark. These results are significant when we consider the importance of the PCD value as a marker of overall cone packing, and they encourage averaging multiple graders' estimates in future studies investigating the functional significance of the PCD. The significance of the demonstrated variability in PCD location is clear when considering the use of PCD location as a marker of the foveal center and a reference point for calculating retinal eccentricities. Currently, there is no standard approach to the definition of the foveal center in AO studies. The bottom of the foveal pit, the center of the foveal avascular zone [22], the point with the longest outer segment (OS) length and the location of PCD [23] are all potential surrogates for the center of the fovea. While histological studies can confidently identify the location of peak density and use it for definitive eccentricity calculation [24], some publications do not specify how they define the landmark [25-28]. Some groups have used the preferred retinal locus of fixation [6], or positions relative to fixation [29]. Choice of a center of mass anchor for calculation of eccentricities initially emerged due to unresolved cones at the foveal center [30-32]. However, subsequent studies with resolution of the densest foveal cones supported the ongoing use of a center of mass approach, due to its higher intersession repeatability [19,33]. The results of our study also confirm greater intergrader reproducibility in the CDC versus the PCD location, further supporting its use as an anchor for calculation of retinal eccentricities. While the bottom of the foveal pit or other OCT landmarks may be equally viable markers of the absolute center of the fovea, AOSLO landmarks like the PCD and CDC have the distinct advantage of being identifiable within the same images, negating the need for alignment and scaling between imaging modalities and the attendant risk of error.
Variable features of the fovea may affect the applicability of these results to future investigations. Changes to the fovea seen in disease may impact reproducibility of these landmarks, as structural cone changes impair waveguiding and result in altered reflectance profiles, including more dimly reflective cones. These changes have been shown to impact reproducibility of cone metrics in studies of achromatopsia [12,34,35], choroideremia [11], drusen and non-proliferative diabetic retinopathy [36], Stargardt disease [8], retinitis pigmentosa and other inherited degenerations [36-38]. Intergrader reliability is known to vary significantly with pathology and the extent to which it affects the cone reflectivity profile. The example in Fig. 1 shows less agreement between graders on the presence and number of cones in dark spaces or locations where cone borders are indistinct, and good agreement where cones are reflective with distinct borders, leading to good agreement on PCD location. Though the lower densities seen in pathologies associated with cell loss may be postulated to increase intergrader agreement, the appearance of cones peculiar to each disease and to different stages of degeneration may make confident identification of the remaining cones less consistent between graders. Moreover, no association was found here between agreement on PCD or CDC location and the PCD value. Even in conditions like albinism that are static in nature, the same pattern of intergrader agreement may not be seen, due to reduced foveal cone packing associated with broader regions of relatively increased density and an absence of the normal sharp increase toward the peak [39,40].
It is important to consider the impact of methodological differences when applying these results to future analyses. These results may be specific to the approach we used to produce pixel-by-pixel density estimates: a sliding window encompassing 150 bound cells in an expanding square around each pixel. There are alternative sampling window approaches, including use of a fixed size [15,16,41], use of a different shape (circular or elliptical rather than square) [16], or use of a fixed number of nearest cells around an image pixel [19]. Choice of sampling strategy dictates the smoothness of the density map, which in turn may impact results. Likewise, use of non-confocal AOSLO imaging could provide images with more uniformly appearing cones, though the lateral resolution is lower than that afforded by confocal AOSLO. Thus, it is unclear whether we would see similar CDC vs PCD offsets or similar intergrader reproducibility using non-confocal images.
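To make the sampling-window dependence concrete, the following Python sketch estimates density at one location with an expanding square window. It is a simplified stand-in, not the Metricks implementation: it counts every cone inside the window rather than only bound (non-edge) Voronoi cells, it does not average overlapping windows, and the function name, starting size and step are hypothetical choices.

```python
def expanding_window_density(cones, center, target=150,
                             start=2.0, step=1.0, max_side=300.0):
    """Density (cones per square µm) in the smallest centered square
    window that contains at least `target` cones.

    `cones` is a list of (x, y) coordinates in µm and `center` is the
    (x, y) location being sampled. Returns None if the window reaches
    `max_side` without enclosing enough cones.
    Assumption: all cones in the window are counted (no Voronoi test).
    """
    cx, cy = center
    side = start
    while side <= max_side:
        half = side / 2.0
        count = sum(
            1 for x, y in cones
            if abs(x - cx) <= half and abs(y - cy) <= half
        )
        if count >= target:
            return count / (side * side)
        side += step
    return None


if __name__ == "__main__":
    # Synthetic square lattice with 1 µm spacing (not study data).
    lattice = [(float(x), float(y)) for x in range(40) for y in range(40)]
    density = expanding_window_density(lattice, (19.5, 19.5))
    print("cones per mm^2:", density * 1e6)
```

Lowering `target` shrinks the window and sharpens the peak at the cost of noisier estimates, while raising it smooths the map; this trade-off is one reason PCD values and locations may not be directly comparable between studies that use different windows.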
While the CDC location is known to have higher intersession repeatability than the PCD location [19], the findings presented here highlight the superior intergrader reproducibility of the CDC location relative to the PCD location. These results support the use of the CDC as an anchor for calculation of eccentricity within AO images from individuals with contiguous cone mosaics, even where resolution of the most densely packed cones is possible. Measurement error associated with PCD location is large enough to encourage reliance on projections derived from lower density regions of the fovea. Use of such an anchor would likely increase agreement between devices, modalities and centers, facilitating the construction of a robust multi-center database of eccentricity-dependent cone density measurements in vivo. Such a database is integral not only to studies linking morphology and function within the normal retina, but also to the success of multicenter clinical trials.

Fig. 2. Comparison of foveal cone topography between graders. Panels A-E are pixel-by-pixel density maps representing entire 300 × 300 µm foveal ROIs. These maps are produced from the cone coordinates generated by each of our 5 graders for one image (that of JC_10312). Overlaid on the maps are 75th-95th percentile isodensity contours in 5 percentile increments, with the 80th percentile isodensity contour indicated by a white line. The location of peak cone density (PCD) is indicated by an open blue circle and the cone density centroid (CDC) is marked by a filled orange circle with a black outline. Panel F shows the confidence ellipses for the PCD and CDC location for this image, with individual PCD and CDC locations for the 5 graders marked according to panels A-E (axis units are in µm). I, inferior; S, superior; N, nasal; T, temporal.

Fig. 3. Variation in foveal cone density across images. Cone density estimates at PCD and CDC locations for each image, ordered by increasing average PCD. Circles indicate the average of all five graders' estimates; error bars indicate the range of estimates for each image at each location. The small reduction in cone density estimates between the CDC and the PCD can be seen across images.

Fig. 4. Interindividual variation in PCD and CDC reproducibility. PCD and CDC confidence ellipses for all 44 images, arranged in ascending order of PCD confidence ellipse area. Boundary squares represent the entire 300 × 300 µm ROI, with tick marks indicating 50 µm distances.

Fig. 5. Offset between PCD and CDC locations. Each point on the scatterplots represents the direction and offset of CDC from PCD for a single image. The bottom rightmost panel represents the averaged data across all graders. Inset in the top right of each individual panel is a cumulative frequency distribution of PCD to CDC offset, showing that for 90% of images the PCD-CDC offset is <39 µm for grader 1, <29 µm for grader 2, <13 µm for grader 3, <26 µm for grader 4 and <41 µm for grader 5. When averaged across graders the offset from PCD to CDC is <20 µm for 90% of images.