Statistical framework for validation without ground truth of choroidal thickness changes detection

Monitoring subtle choroidal thickness changes in the human eye delivers insight into the pathogenesis of various ocular diseases such as myopia and helps in planning their treatment. However, a thorough evaluation of detection performance is challenging, as a ground truth for comparison is not available. Alternatively, an artificial ground truth can be generated by averaging the manual expert segmentations. This makes the ground truth very sensitive to ambiguities arising from different interpretations by the experts. To circumvent this limitation, we present a novel validation approach that operates independently of a ground truth and is based solely on the common agreement between algorithm and experts. Utilizing an appropriate index, we compare the joint agreement of several raters with the algorithm and validate it against manual expert segmentation. To illustrate this, we conduct an observational study and evaluate the results obtained using our previously published registration-based method. In addition, we present an adapted state-of-the-art evaluation method, where a paired t-test is carried out after leaving out the results of one expert at a time. Automated and manual detection were performed on a dataset of 90 OCT 3D-volume stack pairs of healthy subjects between 8 and 18 years of age from Asian urban regions with a high prevalence of myopia.

$\bigcup_{j=1}^{S} \Omega_j \subset \mathbb{R}^3$ of $S$ slices. Using the BM as a reference surface, as mentioned above, the volume surrounding the CSI surface is subdivided into partially overlapping 3D cuboidal blocks $C_i^s$ defined over a regular grid of dimension $\mathcal{N} = N \times S$ in the $O_{xy}$ plane, as illustrated in Fig. 2. Using a multiresolution approach, a 3D regularized block-matching registration of the CSI is conducted. As the images are aligned to the rigid BM, the displacement corresponds to shifts of the CSI. Thus, it is possible to determine the displacement field around the CSI and to use it to quantify choroidal growth. Since this study focuses on quantitative choroidal thickness changes in longitudinal studies, only shifts in the anterior-posterior (z-) direction are considered.
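The subdivision into partially overlapping blocks can be sketched as follows. This is only a minimal illustration, not the CRAR implementation: the function names, the 50% overlap, the volume width of 512 voxels and the restriction to the x-extent of each block are all assumptions made for the example.

```python
import math

def overlapping_blocks(extent, n_blocks, overlap=0.5):
    """Return (start, stop) index pairs of n_blocks equally sized,
    partially overlapping intervals that together cover [0, extent)."""
    if n_blocks == 1:
        return [(0, extent)]
    # Block size such that n_blocks blocks with the requested fractional
    # overlap span the whole extent (rounded up so no voxel is left out).
    size = math.ceil(extent / ((n_blocks - 1) * (1.0 - overlap) + 1.0))
    # Spread the block starts evenly between 0 and extent - size.
    step = (extent - size) / (n_blocks - 1)
    return [(round(i * step), round(i * step) + size) for i in range(n_blocks)]

def block_grid(width_x, n_blocks_x, n_slices, overlap=0.5):
    """Map each grid index (i, s) to the x-interval of block C_i^s
    (hypothetical layout: N blocks per slice, S slices)."""
    x_blocks = overlapping_blocks(width_x, n_blocks_x, overlap)
    return {(i, s): x_blocks[i]
            for s in range(n_slices) for i in range(n_blocks_x)}

# Assumed example: 8 blocks per slice, 3 slices, 512 voxels in x.
grid = block_grid(width_x=512, n_blocks_x=8, n_slices=3)
blocks = overlapping_blocks(512, 8)
```

Here `grid` has one entry per block index pair, so its size equals the product of the number of blocks per slice and the number of slices.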

Piecewise rigid registration
In CRAR the registration is conceived as a regularized minimization problem with the aim of finding a set $U = \{u_i^s\}$ of blockwise constant transformations $u_i^s$ that minimizes the sum of a distance term $D$ and a regularization term $\lambda R$ (Eq. (1)). Here, $D$ is a distance measure that quantifies the similarity between the reference image $I_R$ and the transformed template image $I_T(p + u_i^s(p))$. The regularizer $R$, with its corresponding trade-off parameter $\lambda > 0$, ensures certain properties of the transformation. In comparison to the previous version presented in [1], henceforth called CRAR-v1.0, several improvements have been made that lead to better performance in the detection of temporal changes, as depicted in Fig. 4. Because we analyze changes in the thickness of the choroid over time intervals of at least three months, the (inverse) normalized cross correlation (see Eq. (2)) is a more useful distance measure than the sum of squared differences (used in CRAR-v1.0) for such matching problems, where the tomograms have not been acquired at the same average signal level. Thus, we define $D$ as the inverse normalized cross correlation of Eq. (2).

May 12, 2019

[Figure: panels "Reference stack", "Template stack" and "Reference B-scan". Caption: The pairwise inter-stack rigid registration serves as initialization of CRAR (1), followed by an accurate segmentation of the BM (2). Utilizing the BM as reference, the area surrounding the CSI is subdivided into partially overlapping blocks (3). The displacement is represented by the shifts due to the blockwise transformation of the volume surrounding the lower boundary of the choroid.]
In Eq. (2), $\mu_R$ and $\mu_T$ are the average intensities of $I_R$ and $I_T$, respectively, while $\hat{C}_i^s$ denotes the overlapping volume between the reference and the transformed template image, and $p$ is the point position in block $\hat{C}_i^s \subset \Omega$.
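The preference for the normalized cross correlation over the sum of squared differences can be illustrated with a small sketch on 1D intensity profiles along z. It is not the CRAR implementation: the "inverse" is taken here as 1 − NCC, which is an assumption, and the function names and the toy profiles are hypothetical.

```python
import math

def inverse_ncc(ref, tmpl):
    """1 - NCC of two equally long intensity lists (assumed form of the
    'inverse' normalized cross correlation)."""
    n = len(ref)
    mu_r = sum(ref) / n   # average intensity of the reference block
    mu_t = sum(tmpl) / n  # average intensity of the template block
    num = sum((r - mu_r) * (t - mu_t) for r, t in zip(ref, tmpl))
    den = math.sqrt(sum((r - mu_r) ** 2 for r in ref)
                    * sum((t - mu_t) ** 2 for t in tmpl))
    if den == 0.0:
        return 1.0  # flat block: carries no correlation information
    return 1.0 - num / den

def ssd(ref, tmpl):
    """Sum of squared differences, for comparison."""
    return sum((r - t) ** 2 for r, t in zip(ref, tmpl))

def best_z_shift(ref, tmpl, max_shift):
    """Exhaustive search for the z-shift u minimizing the distance
    on the overlap of ref[p] and tmpl[p + u]."""
    best_d, best_u = float("inf"), 0
    for u in range(-max_shift, max_shift + 1):
        lo = max(0, -u)
        hi = min(len(ref), len(tmpl) - u)
        if hi - lo < 2:
            continue  # overlap too small to correlate
        d = inverse_ncc(ref[lo:hi], tmpl[lo + u:hi + u])
        if d < best_d:
            best_d, best_u = d, u
    return best_u

# Toy profile and a copy shifted by 2 voxels with a different gain and offset.
ref = [0, 0, 1, 5, 1, 0, 0, 0]
tmpl = [7, 7, 7, 7, 10, 22, 10, 7]   # tmpl[p + 2] = 3 * ref[p] + 7
shift = best_z_shift(ref, tmpl, max_shift=3)
```

In this toy case the template differs from the reference by a gain and an offset, so the sum of squared differences is large even at the correct shift, while the correlation-based distance is close to zero there.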

Radial Differences Regularization
In order to adhere to the eye's natural shape, the regularization enforces the local homogeneity of the transformations in the nasal-temporal (x-) and superior-inferior (y-) directions by penalizing their radial differences [4]. The mismatched blocks of the registration process are not corrected individually. Instead, the entire neighborhood is moved until the block configuration with the least bending energy is reached, see Fig. 2. Let $\mathcal{N} = N \times S$ be the total number of cuboids, and let $p_i^s = (x_i^s, y_i^s, z_i^s)$ and $p_j^t = (x_j^t, y_j^t, z_j^t)$ be the centers of the blocks $C_i^s$ and $C_j^t$, respectively. Then, the regularizer $R$ is defined as a kernel-weighted penalty on the differences between $u_i^s(p_i^s)$ and $u_j^t(p_j^t)$, the displacement vectors of $p_i^s$ and $p_j^t$ as obtained from the 3D block-matching. Due to its smoothing properties and compact support, the radial cubic B-spline function $K_b : \Omega \times \Omega \to \mathbb{R}$ has been chosen as kernel. It ensures that displacements of two blocks that are far apart influence each other much less than displacements of blocks within the same $(2\sigma_x \times 2\sigma_y)$-neighborhood. The factor $\|u_i^s(p_i^s) - u_j^t(p_j^t)\|^2$ of Eq.
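A kernel-weighted penalty of this kind can be sketched as below. The sketch rests on several assumptions that are not taken from the text: the kernel argument is the distance between block centers scaled by $\sigma_x$ and $\sigma_y$, the displacements are scalar z-shifts, the sum runs over unordered block pairs, and no normalization constant is applied. All function names are hypothetical.

```python
import math

def cubic_bspline(r):
    """Standard cubic B-spline; vanishes for |r| >= 2 (compact support)."""
    r = abs(r)
    if r < 1.0:
        return 2.0 / 3.0 - r * r + 0.5 * r ** 3
    if r < 2.0:
        return (2.0 - r) ** 3 / 6.0
    return 0.0

def kernel(p, q, sigma_x, sigma_y):
    """Radial kernel on the scaled distance of two block centers (x, y)."""
    r = math.hypot((p[0] - q[0]) / sigma_x, (p[1] - q[1]) / sigma_y)
    return cubic_bspline(r)

def radial_differences(centers, shifts, sigma_x, sigma_y):
    """Sum over block pairs of kernel weight times squared shift difference."""
    total = 0.0
    for a in range(len(centers)):
        for b in range(a + 1, len(centers)):
            w = kernel(centers[a], centers[b], sigma_x, sigma_y)
            total += w * (shifts[a] - shifts[b]) ** 2
    return total

# Four block centers on a line; one block has a mismatched z-shift.
centers = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0), (3.0, 0.0)]
smooth = radial_differences(centers, [3.0, 3.0, 3.0, 3.0], 1.0, 1.0)
outlier = radial_differences(centers, [3.0, 3.0, 8.0, 3.0], 1.0, 1.0)
```

A homogeneous displacement field incurs no penalty, whereas a single mismatched block is penalized through its neighbors only, since the kernel weight drops to zero beyond its support.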

B-spline deformation
Inspired by [5], a synthetically deformed OCT B-scan is created as follows: a target thinning rate $\alpha$ is drawn as a random sample from a uniform distribution on [-25, 25] µm, with $\alpha < 0$ denoting a thinning choroid and $\alpha > 0$ a thickening one. This range for $\alpha$ is chosen to realistically represent changes that can be observed in the choroid's thickness which are bigger than the daily variations up to 29 µm [6]. A 3D regular B-spline grid with $L_1 \times L_2 \times L_3$ control points is created, see Fig. 3. The region of interest to be manipulated is represented by two rows of grid nodes surrounding the CSI. The B-spline deformation is set to 0 outside this region.

To demonstrate the performance of CRAR in recognizing thickness changes, we apply CRAR to the 90 OCT volume stack pairs after artificially induced deformation [5]. The mean errors in detecting changes by CRAR are compared to those obtained with the old version of the algorithm, a state-of-the-art segmentation method based on graph search [7] and manual expert segmentation. As illustrated in Fig. 4, the higher precision of CRAR is particularly pronounced at higher resolution levels. As a reminder, a multiresolution approach is used for the registration in CRAR (for more detail see [1]). In this context, $k$ denotes the resolution level with the corresponding number of patches into which the volume surrounding the CSI is subdivided in the nasal-temporal (x-) direction, e.g. 8, 16, . . . , 512 for $k = 1, 2, . . . , 7$. In Fig. 1 the situation for $k = 1$ and 8 blocks is shown.

[Figure caption: The average differences between the synthetically induced displacements and the measured ones, obtained with CRAR-v2.0 (red), CRAR-v1.0 (green), a state-of-the-art graph search based segmentation method [7] (purple) and manual expert segmentation (cyan), applied on the 90 OCT volume stack pairs after synthetic deformation for different resolution levels k.]
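The sampling of the induced change and the resolution levels can be sketched as follows. Several points are assumptions made for illustration: the doubling of the number of patches per level is inferred from the sequence 8, 16, . . . , 512, the error is taken as a mean absolute difference, and the "measured" shifts are simulated stand-ins rather than study data. All names are hypothetical.

```python
import random

def patches_at_level(k, base=8):
    """Number of patches in x-direction at resolution level k,
    assuming the count doubles with each level."""
    return base * 2 ** (k - 1)

def sample_alpha(rng, limit_um=25.0):
    """Target thickness change in micrometers, uniform on [-limit, limit]."""
    return rng.uniform(-limit_um, limit_um)

def mean_abs_error(induced, measured):
    """Average absolute difference between induced and measured shifts."""
    return sum(abs(a - m) for a, m in zip(induced, measured)) / len(induced)

rng = random.Random(0)  # fixed seed so the sketch is reproducible
alphas = [sample_alpha(rng) for _ in range(90)]  # one change per stack pair
# Hypothetical detector output: induced change plus Gaussian noise (1 µm).
measured = [a + rng.gauss(0.0, 1.0) for a in alphas]

levels = {k: patches_at_level(k) for k in range(1, 8)}
error = mean_abs_error(alphas, measured)
```

In an actual evaluation, `measured` would be replaced by the shifts detected by each method at each resolution level, giving one mean error per method and level.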