Fully automated analysis of OCT imaging of human kidneys for prediction of post-transplant function

Current measures for assessing the viability of donor kidneys are lacking. Optical coherence tomography (OCT) can image subsurface tissue morphology to supplement current measures and potentially improve prediction of post-transplant function. OCT imaging was performed on donor kidneys before and immediately after implantation during 169 human kidney transplant surgeries. A system for automated image analysis was developed to measure structural parameters of the kidney’s proximal convoluted tubules (PCTs) visualized in the OCT images. The association of these structural parameters with post-transplant function was investigated. This study included kidneys from live and deceased donors. Of the deceased donor kidneys in this study, 88 were stored by static cold storage (SCS) and an additional 15 were preserved by hypothermic machine perfusion (HMP). A subset of both SCS and HMP deceased donor kidneys was classified as expanded criteria donor (ECD) kidneys, with elevated risk of poor post-transplant function. Post-transplant function was characterized as either immediate graft function (IGF) or delayed graft function (DGF). In ECD kidneys stored by SCS, increased PCT lumen diameter was found to predict DGF both prior to implantation and following reperfusion. In SCD kidneys preserved by HMP, reduced distance between adjacent lumen following reperfusion was found to predict DGF. Results suggest that OCT measurements may be useful for predicting post-transplant function in ECD kidneys and kidneys stored by HMP. OCT analysis of donor kidneys may aid in allocation of kidneys to expand the donor pool as well as help predict post-transplant function in transplanted kidneys to inform post-operative care.

performed annually but transplant centers still ultimately discard a large portion of ECD kidneys procured and offered for transplant [2][3][4]. The discard rate for ECD kidneys is nearly 45% compared to just over 10% for standard criteria donor (SCD) kidneys [5].
These discards represent a largely untapped source of potentially viable kidneys which, if properly utilized, could further widen the donor pool and narrow the gap between kidney supply and kidney demand. Studies have demonstrated that patients who receive moderately compromised kidneys live longer and have a higher quality of life than those who remain on dialysis and wait for a more viable option [6,7]. Currently there are approximately 17,000 kidney transplants a year in the United States. It is estimated that this number could be as high as 38,000 if more marginally compromised kidneys were considered and the donor pool properly utilized [8].
Surgeons reference a multitude of factors which contribute to their decision to reject a kidney. Principal among these are the results of biopsies, which are performed routinely on ECD kidneys, and are credited as the most frequent reason for discard. The true relevance of these factors is contested, with the majority appearing to have little correlation with graft function following transplant [9]. There is a critical need to enhance prognostic measures and to explore new ways of gaining insights into the viability of these more at-risk kidneys.
Optical Coherence Tomography (OCT) provides a non-invasive method for obtaining optical cross-sections of the superficial kidney cortex [10,11]. OCT is an interferometry-based imaging modality, similar in principle to ultrasound, which uses the light scattering characteristics of tissue to construct high-resolution subsurface images. Cross-sectional 2-dimensional OCT images (B-scans) are composed of a series of sequential 1-dimensional A-scans, reflectivity vs. depth profiles, which represent subsurface features in the sample [12][13][14]. These images reveal the microanatomy of the proximal convoluted tubules (PCTs), which comprise the majority of the superficial kidney cortex. Swelling of the epithelium of the PCTs is evident in OCT images and may be considered a symptom of ischemic insult [15,16]. Conversely, dilation of the tubular lumen is similarly evident in OCT and may be considered a symptom of pre-existing pathology or acute tubular injury (ATI) [17][18][19]. Quantification of the degree of swelling or prevalence of dilation may provide a valuable addition to current measures of kidney viability. This would contribute to more informed decision making and optimal usage of the kidney donor pool.
OCT also has potential utility following transplant, where a more accurate prediction of post-transplant function could influence post-operative care. Delayed Graft Function (DGF) is an established risk factor for survival of a transplanted kidney [20]. If DGF can be predicted immediately following transplant, early post-operative biopsies to investigate poor function can be avoided. Early diagnosis of DGF can similarly inform the development of immunosuppressive treatments, where evidence of potential DGF would provide incentive for a less nephrotoxic, calcineurin-sparing regimen [21]. An accurate prediction of DGF would also promote the use of any of a number of anti-DGF medications currently in development, should they be approved.
For OCT to be used effectively in a clinical setting, image analysis must be conducted quickly, reliably, and without bias. Automated segmentation achieves these goals and can provide rapid and accurate assessments of ischemic damage to the kidney [22,23]. In this paper, we present a fully automated system which can identify and segment the microanatomy of the human kidney in OCT images with accuracy comparable to manual segmentation. We demonstrate the correlation between quantitative measurements derived from this segmentation and graft function following transplant.
[...] obtained prior to enrollment. Patients eligible for this study included any kidney transplant recipient 18 years or older at the MedStar Georgetown Transplant Hospital.
Patient demographics were obtained at the time of consent. The patient pool was composed of approximately 60% male and 40% female recipients. Mean age at transplant was 52 ± 12.5 years (mean ± standard deviation). Mean BMI of recipients was 28.4 ± 4.7. 61% of patients in this study were African American, 24% were Caucasian, 8% were Hispanic, and 7% were Asian.

Transplant group categorizations
Of the 169 kidneys imaged and included in this study, 66 were from living-donor kidney transplants (LDKTs) and 103 were from deceased-donor kidney transplants (DDKTs). All LDKTs were preserved by static cold storage (SCS). Of the 103 DDKTs, 88 were preserved by SCS and 15 were preserved by hypothermic machine perfusion (HMP). 4 of the kidneys in the SCS group were part of a multi-organ transplant (kidney/pancreas). Of the 88 SCS kidneys, 26 had a KDPI (Kidney Donor Profile Index: score from 0 to 100 based on 10 donor factors, estimating the risk of graft failure) of 85 or more and were subcategorized as expanded criteria donor (ECD) kidneys. The remaining 62 SCS kidneys were subcategorized as standard criteria donor (SCD) kidneys [3]. Of the 15 kidneys in the HMP group, 2 kidneys qualified as ECD, and the remaining 13 were subcategorized as SCD kidneys (Fig. 1). Patients whose data were excluded from the analysis included those involved in parallel studies for anti-DGF clinical trials (1 patient) and those where image quality of the OCT image sets was compromised (2 patients).

Recovery group categorizations
Graft function following transplant was categorized as either immediate or delayed. Delayed graft function was designated when a transplant recipient was required to undergo dialysis within the first seven days following transplant [24] or when otherwise specified as DGF in clinical notes. All cases where transplant recipients did not require dialysis prior to discharge were considered immediate graft function (IGF). Recovery groupings for each transplant group were as follows: LDKT (65 IGF, 1 DGF), SCS SCD (51 IGF, 11 DGF), SCS ECD (18 IGF, 8 DGF), HMP SCD (8 IGF, 5 DGF), and HMP ECD (1 IGF, 1 DGF) (Fig. 1).
Fig. 1. Hierarchy classification of transplant groups with all transplants (blue tier 1) divided into live and deceased donor kidney transplants (blue tier 2). DDKTs are further divided into subgroups based on storage method (blue tier 3). DDKTs stored by SCS and HMP are further divided into subgroups based on risk of graft failure (blue tier 4). Each end-tier transplant group is divided into recovery groups based on requirement of dialysis (green and red).
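The grouping rules described in the two sections above can be summarized in a short sketch. This is illustrative only: the record fields, the function names, and the convention that "within the first seven days" means a day count of 7 or less are assumptions introduced here, not part of the study's software.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class Transplant:
    # Hypothetical record layout, for illustration only.
    donor_type: str                        # "living" or "deceased"
    storage: str                           # "SCS" or "HMP"
    kdpi: Optional[float]                  # None when no KDPI applies
    days_to_first_dialysis: Optional[int]  # None if no dialysis was required
    dgf_in_clinical_notes: bool = False


def transplant_group(t: Transplant) -> str:
    """Return LDKT, or storage method plus SCD/ECD for deceased donors."""
    if t.donor_type == "living":
        return "LDKT"
    # A KDPI of 85 or more marks an expanded criteria donor (ECD) kidney.
    criteria = "ECD" if t.kdpi is not None and t.kdpi >= 85 else "SCD"
    return f"{t.storage} {criteria}"


def recovery_group(t: Transplant) -> str:
    """DGF if dialysis occurred in the first week or notes specify DGF."""
    early_dialysis = (
        t.days_to_first_dialysis is not None
        and t.days_to_first_dialysis <= 7  # assumed day-count convention
    )
    return "DGF" if early_dialysis or t.dgf_in_clinical_notes else "IGF"
```

For example, a deceased-donor kidney stored by SCS with a KDPI of 90 whose recipient needed dialysis on day 3 would fall in the SCS ECD transplant group and the DGF recovery group.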

Imaging protocol
Imaging in this study was performed with a 1325 nm center wavelength spectral-domain OCT imaging system (Telesto-II, Thorlabs Inc.), with an incident power of 2.5 mW. The Telesto OCT system was equipped with a 36 mm focal length (LSM03, Thorlabs Inc.) objective, providing a lateral resolution of 13 µm and an axial resolution of 5.5 µm in air. Scans were captured at a rate of 28 kHz, with a sensitivity of 103 dB. A-scans were averaged by 2 and no B-scan averaging was applied. B-scan settings were optimized to minimize file storage size while providing a sufficient field of view (FOV) and resolution for analysis. Parameters included a FOV of 4.9 mm in x-axis and 1.9 mm in z-axis (after adjusting for a refractive index of 1.3) at a scale of approximately 2.73 µm/pixel in each dimension (Fig. 2).
A technician in sterile surgical attire operated a handheld scanner, draped in a sterile sleeve with a layer of sterile Tegaderm transparent film dressing affixed to the focal spacer. Image sets were obtained ex-vivo immediately prior to implantation and again in-vivo immediately (13 ± 4 minutes) following reperfusion of the transplanted kidney.
[...] manually segmented images were reassigned to different raters to produce measures of inter-rater variation (Fig. 4).
Manual segmentation was performed on 5 randomly selected images from each image set. Raters segmented the interface between the renal capsule and the cortex (upper red and blue lines in Fig. 4). Raters also segmented the full volume of quantifiable cortex (the area of cortex beneath the capsule where the signal appeared sufficient to discriminate anatomical features) (area between upper and lower red and blue lines in Fig. 4). Raters then segmented all regions which appeared to be cross-sections of PCT lumen, using the ImageJ "Versatile Wand" plugin [25] (red and blue selections in Fig. 4 with cyan indicating overlap). If a randomly selected image contained no quantifiable cortex, the image was skipped and the reason for exclusion was tallied as either "empty" with no contact between the probe and kidney (section 2.6.1), "high reflection" (section 2.6.2), or "high adipose" (section 2.6.3).

Automatic segmentation
Automatic segmentation was executed in MATLAB R2017b (MathWorks, Inc., Natick, MA, USA). To remove user bias and to improve feasibility of clinical application, automatic segmentation and analysis were performed on the original full 2D video image sets and not on manually selected subsets of images. To expedite analysis and prevent error, it was necessary to remove from processing those images which contained no quantifiable cortex. Features were extracted and compiled from images skipped and marked during manual analysis. These features were used to identify empty, high reflection, or high adipose images prior to performing more computationally expensive sections of the algorithm.

Empty B-scan detection
While a threshold of total intensity values would be an intuitive and high-speed approach to detection of empty B-scans, variations between empty images in background intensity, imaging artifacts, and hyper-reflectivity of Tegaderm ruled out this strategy. Empty images were therefore identified by their average standard deviation in intensity values across the z-axis.
For each B-scan, the standard deviation of intensity values across each A-scan was taken and all A-scan standard deviations for that B-scan averaged. This process was repeated for all images marked during manual analysis as "empty" (Fig. 5(a)), and for all images which had cortex present and were manually segmented (Fig. 5(b)). Comparison between these two groups demonstrated that a mean A-scan standard deviation of 47 or less correlated highly with images categorized as "empty" while a mean A-scan standard deviation above 47 correlated well with images which contained kidney (Fig. 5(c)). A standard deviation cutoff of 47 identified empty images with a sensitivity of 83.28% and a specificity of 98.91%.
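The empty B-scan check described above can be sketched in a few lines. This is a minimal illustration, not the study's MATLAB code: the function names and the list-of-lists image layout are assumptions, and the sample standard deviation (N-1 normalization) is assumed because the text does not state which normalization was used.

```python
from statistics import mean, stdev

EMPTY_STD_CUTOFF = 47  # cutoff reported in the text, in intensity units


def mean_ascan_std(bscan):
    """Average, over all A-scans, of the intensity standard deviation along z.

    bscan: list of A-scans, each a list of intensity values along depth.
    """
    return mean(stdev(ascan) for ascan in bscan)


def is_empty(bscan, cutoff=EMPTY_STD_CUTOFF):
    """Flag a B-scan as empty when its mean A-scan std is at or below cutoff."""
    return mean_ascan_std(bscan) <= cutoff
```

A uniform background gives a standard deviation near zero in every A-scan and is flagged as empty, whereas an image with strong depth-dependent signal variation is retained for further processing.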

Reflect
[Text garbled in the source. The surviving fragments appear to concern detection of bright vertical reflections (Fig. 6), edge strength and removal of the Tegaderm layer (Figs. 8-10), texture weighting of the lumen map (Fig. 10), thresholding of the contrast map for automatic segmentation of PCT lumen (Fig. 11), and comparison of B-scan cross-sections with cross-sections in the plane of a 3D scan (Fig. 13).]
The B-scan cross-section features were fed as inputs into MATLAB's "Regression Learner App" with the percent reduction in area from the B-scan cross-section to the true cross-section as the response. A linear regression model was trained with 10-fold cross-validation to predict the percent reduction in area required to transform an elongated or irregularly shaped cross-section into the area of the corresponding orthogonal cross-section. The model yielded a root-mean-square error (RMSE) of 0.15 and an R-squared value of 0.69. The linear regression model was employed to correct the area of elongated and irregularly shaped cross-sections to the area of the corresponding true cross-sections. A notable limitation of this correction method, however, is that only one kidney was used for training of the model. In addition, this kidney was preserved in a formaldehyde solution and so may not accurately represent PCT morphology of a kidney used for transplant. Similarly, feature evaluation of the orthogonal cross-sections revealed that these sections were, on average, moderately elliptical (eccentricity of 0.67 ± 0.15); orthogonal cross-sections had, on average, a minor axis to major axis length ratio of 3:4. Consequently, the linear regression model, depending on input features, may produce area estimations of non-circular orthogonal cross-sections.
While orthogonal sectioning of tubules in kidneys preserved for transplant likely does not consistently produce perfectly circular lumen cross-sections due to anatomical heterogeneity and storage effects, it should be considered that the formaldehyde preservation of the kidney used in the linear regression model may have altered the circularity of the tubular lumen.

Diameter measurements
The diameter of lumen in PCT cross-sections was measured for all cross-sections in each B-scan. As the epithelium of the PCTs swells, the visible lumen should reduce. Conversely, as the epithelium is flattened or simplified, the visible lumen should increase. Diameter of the PCT lumen should therefore maintain an inverse relationship to the degree of swelling, and a direct relationship to the degree of epithelial flattening/simplification.
Diameter measurements are similarly impacted by the limitations of the 2D imaging protocol, with elongated non-orthogonal sections (red in Fig. 13(c)) potentially misrepresenting true lumen diameter. To circumvent this issue, diameter was defined as the "minor axis length" (shortest diameter which passes through the center of the ROI). This definition ensures that the elongated axis of tangential sections does not bias the diameter measurement. However, this may result in under-representation of the true diameter if the imaging plane does not cut through the tubular center axis. Consequently, an additional diameter measurement, derived from the corrected area, was used. This measure calculated diameter from the linear regression corrected area using the equation for the area of a circle ($A = \pi r^2$). To assess accuracy, a 50 µm capillary phantom was embedded in an agar solution which mimicked the scattering properties of kidney tissue. OCT scans were performed on the phantoms at three locations, and ROI maps were generated by the described method. Diameter of the interior of the capillary phantoms was calculated by the two methods described in this section and produced diameters of 45.7 ± 2.9 µm and 50.3 ± 3.1 µm as measured by minor axis length and from corrected area respectively.
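The area-derived diameter follows from inverting the area of a circle, d = 2·sqrt(A/π). The sketch below shows that step together with the preceding area correction. The function names are hypothetical, and expressing the regression output as a fraction between 0 and 1 is an assumption made here for illustration.

```python
import math


def corrected_area(area_um2, predicted_fraction_reduction):
    """Shrink a B-scan cross-section area by the predicted reduction.

    predicted_fraction_reduction: assumed to be the regression output
    expressed as a fraction (0.25 means a 25% reduction in area).
    """
    return area_um2 * (1.0 - predicted_fraction_reduction)


def diameter_from_area(area_um2):
    """Diameter of the circle whose area equals area_um2 (A = pi * r^2)."""
    return 2.0 * math.sqrt(area_um2 / math.pi)
```

As a check on the inversion, a circular lumen with a 50 µm diameter has an area of π·25² µm², and passing that area to `diameter_from_area` recovers 50 µm.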

Inter-Lumen measurements
The minimum distance between edges of adjacent lumen was measured between all adjacent PCT lumen cross-sections in each B-scan (green in Fig. 14). ROIs were defined as adjacent when their centroids were within 110 µm of each other (determined empirically as the maximum distance before tubule lumen outside of immediate adjacency were included) (red circle in Fig. 14). This inter-lumen distance was considered a measurement of the combined thickness of the epithelium of two adjacent PCTs and any interstitial space. As the epithelium swells, the inter-lumen distance should increase. Conversely, as the epithelium is flattened or simplified, the inter-lumen distance should reduce.
Fig. 14. Depiction of methodology for inter-lumen and inter-centroid measurements. The red circle represents a 110 µm radius around the center ROI of "adjacent" ROIs. Distances between lumen edges and centroids are represented in green and blue respectively.
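The inter-lumen measurement can be sketched as follows. This is an illustration under stated assumptions: each ROI is represented by a list of boundary points in µm, the centroid is approximated by the mean of those boundary points, and the function names are hypothetical rather than taken from the study's code.

```python
import math
from itertools import combinations

ADJACENCY_RADIUS_UM = 110.0  # adjacency radius reported in the text


def centroid(points):
    """Mean position of a list of (x, y) points, in µm."""
    xs, ys = zip(*points)
    return (sum(xs) / len(xs), sum(ys) / len(ys))


def min_edge_distance(edge_a, edge_b):
    """Smallest distance between any boundary point of A and any of B."""
    return min(math.dist(p, q) for p in edge_a for q in edge_b)


def inter_lumen_distances(rois, radius=ADJACENCY_RADIUS_UM):
    """Minimum edge-to-edge distance for every pair of adjacent ROIs."""
    centroids = [centroid(r) for r in rois]
    distances = []
    for i, j in combinations(range(len(rois)), 2):
        # Only pairs whose centroids lie within the radius count as adjacent.
        if math.dist(centroids[i], centroids[j]) <= radius:
            distances.append(min_edge_distance(rois[i], rois[j]))
    return distances
```

The brute-force pairwise search is quadratic in the number of boundary points, which is acceptable for a sketch but would likely be replaced by a distance transform or spatial index in a production pipeline.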

Inter-centroid measurements
The distance between centroids of adjacent PCT lumen was similarly measured between all adjacent PCT lumen cross-sections in each B-scan (blue in Fig. 14). This was considered a measurement of the combined lumen, epithelium, and interstitial space. The inter-centroid distance may be mostly unaffected by PCT swelling and epithelial flattening, as changes to epithelial thickness and lumen diameter are inversely related and may balance. The inter-centroid distance may therefore reflect changes to the interstitial space.
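The balancing argument above can be made concrete with a small sketch. It assumes idealized circular lumen cross-sections, for which the inter-centroid distance equals the edge-to-edge gap plus the two lumen radii; the function names and the example values are hypothetical.

```python
import math
from itertools import combinations


def inter_centroid_distances(centroids, radius_um=110.0):
    """Distances between all pairs of centroids that count as adjacent."""
    distances = []
    for a, b in combinations(centroids, 2):
        d = math.dist(a, b)
        if d <= radius_um:
            distances.append(d)
    return distances


def edge_gap_for_circles(centroid_distance, diameter_1, diameter_2):
    """Edge-to-edge gap between two circular lumen (idealized geometry)."""
    return centroid_distance - (diameter_1 + diameter_2) / 2.0
```

With the centroid distance fixed at 80 µm, two 30 µm lumen leave a 50 µm gap; if swelling narrows both lumen to 20 µm, the gap grows to 60 µm while the centroid distance is unchanged. A change in the inter-centroid distance therefore points to something other than the lumen-epithelium trade-off, consistent with the interpretation in the text.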

B-scan selection and measurement compilation
Measurements were compiled for each B-scan in each image set. As the 2D imaging protocol produced numerous duplicate or redundant images, only one B-scan was selected from each image set for analysis. As the imaging protocol was to survey regions with the greatest area of visible tubule lumen (i.e. highest PCT lumen density), B-scan results were sorted by density and the maximum density B-scan was selected for inclusion in results. Measurements from these selected B-scans were averaged to yield values for pre-implantation and post-reperfusion scans for each kidney. Results were averaged for each recovery group (IGF, DGF) in each transplant group (LDKT, SCS (SCD), SCS (ECD), HMP (SCD)) and represented in box and whisker plots.
In addition to analysis of correlation between measurements from selected B-scans and binary recovery group categories (IGF/DGF), the relationship between measurements and the decline in patients' serum creatinine levels following transplant (which should decline rapidly and to a level <3.0 mg/dL if a transplanted kidney is well functioning) was investigated [27]. Linear mixed effects models were fitted to regress the longitudinal measures of serum creatinine from day 0 to day 5 on each patient, accounting for within-subject variation by assuming an AR(1) (first-order auto-regressive structure with homogeneous variances) covariance structure and allowing random intercepts for between-subject variation. The baseline creatinine measure, time, and interactions between time and each measurement were also included in the models. Models were fitted following our initial hypotheses that flattened PCT epithelium and dilated lumen would represent pathology, and consequently that higher inter-lumen distance measurements, lower diameter measurements, and lower density measurements (which we initially predicted would echo diameter measurement trends) would correlate with a faster recovery (steeper decline in creatinine). Higher inter-centroid distances were hypothesized to represent pathology (as indicative of interstitial inflammation), and consequently lower inter-centroid distances would correlate with a faster recovery (steeper decline in serum creatinine).
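The selection and averaging steps described above can be sketched as follows. The dictionary layout, key names, and function names are assumptions for illustration; only the rule itself (one maximum-density B-scan per image set, then an average across image sets) comes from the text.

```python
from statistics import mean


def select_max_density(bscans):
    """Pick the B-scan with the highest PCT lumen density in one image set.

    bscans: list of dicts, each holding a 'density' value and other
    measurements for a single B-scan (hypothetical layout).
    """
    return max(bscans, key=lambda b: b["density"])


def kidney_value(image_sets, key):
    """Average one measurement over the selected B-scan of each image set."""
    return mean(select_max_density(s)[key] for s in image_sets)
```

For instance, with two image sets whose highest-density B-scans have diameters of 40 µm and 50 µm, `kidney_value(image_sets, "diameter")` returns 45 µm for that kidney and time point.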
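One way to write the model described above is given below. The notation is introduced here for illustration and is not taken from the original analysis; in particular, including a main effect for the OCT measurement alongside its interaction with time is an assumption.

```latex
% y_{ij}: serum creatinine of patient i on day t_j (days 0 to 5)
% c_i:    baseline creatinine of patient i
% m_i:    OCT measurement (e.g. diameter) for patient i's kidney
% b_i:    random intercept for patient i
\begin{aligned}
y_{ij} &= \beta_0 + b_i + \beta_1 c_i + \beta_2 t_j + \beta_3 m_i
          + \beta_4 \,(t_j \times m_i) + \varepsilon_{ij},
          \qquad b_i \sim \mathcal{N}(0, \sigma_b^2), \\
\operatorname{Cov}(\varepsilon_{ij}, \varepsilon_{ik}) &= \sigma^2 \rho^{\,|j-k|}
          \qquad \text{(AR(1), homogeneous variances)}.
\end{aligned}
```

In this form, the interaction coefficient $\beta_4$ describes how the rate of change of creatinine over time depends on the OCT measurement, which is the quantity tested against each hypothesis.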

Compari
[Text garbled in the source; a surviving fragment reports 5.2 ± 3.7 pixels for automatic segmentation and cites Fig. 15.] [...] segmented cortex volumes compared to manually segmented cortex produced a Dice score of 0.84 ± 0.05. Comparison between manual raters' segmentations produced a Dice score of 0.81 ± 0.06. For selection of PCT lumen from the ROI map, a simple decision tree, with sensitivity and specificity comparable to more complex models, was selected to ensure robustness of the classifier. The classification tree was able to accurately select PCT lumen from the ROI map with a sensitivity of 85.58% and a specificity of 89.04%.
Fig. 16. Table representing reproducibility measurements for manual raters (left) reassigned 25 B-scans each from their original sets. MAE, Dice coefficients, and Cohen's kappa coefficients are calculated for reproducibility in capsule-cortex interface, quantifiable cortex, and PCT lumen selections respectively. Kappa scores are also shown for only B-scans where density measurements were >5% (i.e. there was not a low population of tubule lumen). Comparison between manual raters' initial segmentations of the 25 reassigned images and automatic segmentation performed on those same images is also shown (right).
To assess reproducibility among manual raters, raters were reassigned 25 B-scans, randomly selected from B-scans which they had previously segmented. For segmentation of the capsule-cortex interface, MAE was calculated between each rater's two segmentations of each B-scan, and ranged from 9 to 15 µm between raters (Fig. 16). Dice scores were similarly calculated between each rater's two segmentations of quantifiable kidney cortex and ranged from 0.77 to 0.9, with most raters achieving >0.8. Cohen's kappa coefficients were calculated between PCT lumen selections in both sets of segmented images and demonstrated fair to moderate agreement, with scores ranging from 0.38 to 0.6. Kappa coefficients improved dramatically, to a range of 0.55 to 0.72, when assessing only images with at least moderate (>5%) density.
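The three agreement measures used above can be sketched as follows. This is a generic illustration: the function names are hypothetical, masks are flattened to lists of 0/1 labels, and the unit over which Cohen's kappa was computed in the study (pixels or candidate ROIs) is not stated in the text, so the per-item form shown here is an assumption.

```python
def mae_um(depths_a, depths_b, um_per_pixel=2.73):
    """Mean absolute error between two interface traces, converted to µm.

    depths_a, depths_b: interface depth in pixels for each A-scan.
    """
    errors = [abs(a - b) for a, b in zip(depths_a, depths_b)]
    return um_per_pixel * sum(errors) / len(errors)


def dice(mask_a, mask_b):
    """Dice coefficient, 2|A∩B| / (|A| + |B|), for two binary masks."""
    overlap = sum(1 for a, b in zip(mask_a, mask_b) if a and b)
    total = sum(map(bool, mask_a)) + sum(map(bool, mask_b))
    return 2.0 * overlap / total if total else 1.0


def cohens_kappa(labels_a, labels_b):
    """Cohen's kappa for two binary label sequences of equal length."""
    n = len(labels_a)
    observed = sum(1 for a, b in zip(labels_a, labels_b) if bool(a) == bool(b)) / n
    p_a = sum(map(bool, labels_a)) / n
    p_b = sum(map(bool, labels_b)) / n
    # Agreement expected by chance from each rater's marginal rates.
    expected = p_a * p_b + (1.0 - p_a) * (1.0 - p_b)
    return (observed - expected) / (1.0 - expected) if expected != 1.0 else 1.0
```

Kappa corrects raw agreement for the agreement expected by chance. This may partly explain why scores were lower on B-scans with few lumen, since when positives are rare, chance agreement is already high and a few disagreements have a large effect.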

Density by area results stratified by transplant group (IGF/DGF combined)
Distinctions between measurements from the ECD subgroup of DDKT kidneys stored by HMP and other transplant groups were not investigated due to limited sampling of ECD kidneys in the DDKT-HMP group (n = 2).
Prior to implantation (left in Fig. 17), kidneys from the LDKT transplant group demonstrated higher (p<0.001) PCT lumen density than DDKT kidneys stored by SCS. This difference may be considered a consequence of the markedly different transplant conditions, namely a considerably reduced ischemic time (mean of 1.47 ± 0.61 hours for LDKT versus 13.49 ± 7.06 hours for DDKT-SCS SCD and ECD subgroups). The SCD subgroup of DDKT kidneys stored by HMP had a higher (p<0.001) pre-implantation density than all other transplant groups. The high HMP density may be a result of artificial dilation of the PCT lumen by the machine-perfusion process. The LDKT group and the DDKT-SCS SCD and ECD subgroups all experienced an increase in density between pre-implantation and post-reperfusion scans. This is consistent with prior studies demonstrating a dramatic reduction in swelling of ischemic PCTs (which would present as an increase in total lumen area) following reperfusion [13,15]. In contrast to all other groups, the HMP group experienced a reduction in density following reperfusion, suggesting either some dissipation of the artificial dilation or induction of swelling. Post-reperfusion density (right in Fig. 17) was similar between LDKT and the DDKT-SCS SCD and ECD subgroups. Post-reperfusion density in the HMP group remained higher (p<0.05) than in both DDKT-SCS subgroups, and moderately higher than in

Density by area results stratified by recovery group (IGF vs. DGF)
Distinctions between IGF and DGF recovery group measurements in the LDKT transplant group were not investigated due to limited sampling of DGF kidneys (n = 1). Similarly, distinctions between IGF and DGF recovery group measurements in the ECD subgroup of DDKT kidneys stored by HMP were not investigated due to limited sampling (n = 1 for IGF, n = 1 for DGF). In all transplant groups, density values were similar between IGF and DGF recovery groups (green and red respectively in Fig. 17) prior to implantation. Following transplant and reperfusion, density measurements for the DDKT kidneys stored by SCS increased in both SCD and ECD subgroups for both IGF and DGF recovery groups. In the HMP group, the IGF recovery group experienced a <1% change in density while the DGF recovery group experienced a 23% reduction in density following reperfusion. In the SCD subgroup of DDKT kidneys stored by SCS, post-reperfusion density was similar between IGF and DGF recovery groups. In the ECD subgroup, however, post-reperfusion density in the IGF recovery group was lower (p<0.05) than that of the DGF group. Conversely, in the HMP group, post-reperfusion density in the IGF recovery group was higher (p = 0.28 for original density, p<0.05 for corrected) than in the DGF recovery group.

Density results by association with post-transplant creatinine decline
Following our initial hypothesis that lower PCT lumen density would correlate with a faster recovery following transplant (i.e. density is positively correlated with creatinine values and lower density is correlated with a steeper decline in creatinine (i.e., has a negative interaction effect with time)), linear mixed effect models were fitted for each DDKT transplant group. The pre-implantation fitted model for the SCS-SCD group did not support the hypothesis (p = 0.89); however, the post-reperfusion SCS-SCD model trended moderately towards support of the hypothesis (p = 0.09). Neither the pre-implantation nor the post-reperfusion fitted model for the SCS-ECD group supported the hypothesis (p = 0.74 and p = 0.15 respectively). Finally, the pre-implantation model for the HMP-SCD group did support the hypothesis (p<0.01), as did the post-reperfusion model (p<0.001).

Diameter results stratified by transplant group (IGF/DGF combined)
Diameter measurements were relatively consistent between minor axis length and corrected area methods of measurement. Diameter calculated from corrected area was, however, moderately but consistently higher than diameter calculated as the minor axis length. This effect is likely due to the linear regression model's predictions of instances of moderately elliptical orthogonal cross-sections, which the minor axis length would underestimate.
Prior to implantation (left in Fig. 18), kidneys from the LDKT transplant group demonstrated moderately higher PCT lumen diameter than DDKT kidneys stored by SCS. DDKT kidneys stored by HMP had higher (p<0.001) pre-implantation diameter than all other transplant groups. All groups experienced an increase in diameter between pre-implantation and post-reperfusion scans. The LDKT and DDKT-HMP groups both experienced a modest 5% increase, while DDKT-SCS SCD and ECD subgroups both experienced a larger increase in diameter (18% and 13% respectively). Post-reperfusion diameter (right in Fig. 18) was similar between the LDKT transplant group and the ECD subgroup of DDKT kidneys stored by SCS. Post-reperfusion diameter in the SCD subgroup of DDKT kidneys stored by SCS was moderately higher (p = 0.08) than in the ECD subgroup and the LDKT transplant group (p<0.05). Post-reperfusion diameter in the HMP group was higher than in all other groups (p<0.005, p = 0.08, p<0.005 for DDKT-SCS, LDKT, and DDKT-ECD respectively).
Fig. 18. Box and whisker plots of diameter measurements calculated by minor axis length (a) and from lumen area corrected by linear regression (b) for pre-implantation (left) and post-reperfusion (right) scans for the LDKT group, and the DDKT subgroups: SCD kidneys stored by SCS, ECD kidneys stored by SCS, and SCD kidneys stored by HMP. Each transplant group is further divided into recovery groups which experienced either IGF (green) or DGF (red) following transplant. Mean diameter values for each recovery group are included in the attached table with p-values (from Student's t-test) and values adjusted for FDR between transplant groups, representing significance of difference between recovery groups for each transplant group. The mean percent change (increase or decrease) to diameter following reperfusion is included at the bottom of each table for both recovery groups in each transplant group.

Diameter results stratified by recovery group (IGF vs. DGF)
In the SCD subgroup of DDKT kidneys stored by SCS, diameter measurements were similar between IGF and DGF recovery groups (green and red respectively in Fig. 18) prior to implantation. In the ECD subgroup of DDKT kidneys stored by SCS, pre-implantation diameter measurements were lower (p<0.05) in the IGF than in the DGF recovery group. In the SCD subgroup of DDKT kidneys stored by HMP, pre-implantation diameter measurements were similar between IGF and DGF recovery groups. Following reperfusion, diameter measurements for all recovery groups in all transplant groups increased. Within the SCD subgroup of DDKT kidneys stored by SCS and the HMP group, increases were similar between IGF and DGF recovery groups. In the ECD subgroup, diameter of the IGF recovery group increased 10% while diameter in the DGF group increased 17%. Post-reperfusion diameter in the SCD subgroup of kidneys stored by SCS was similar between IGF and DGF recovery groups. Within the ECD subgroup, diameter in the IGF recovery group remained lower (p<0.005) than in the DGF group. In the HMP transplant group, IGF diameter was moderately lower than in the DGF group (p = 0.34).

Diameter results by association with post-transplant creatinine decline
Following our initial hypothesis that lower PCT lumen diameter would correlate with a faster recovery following transplant (i.e. diameter is positively correlated with creatinine values and lower diameter is correlated with a steeper decline in creatinine (i.e., has a negative interaction effect with time)), linear mixed effect models were fitted for each DDKT transplant group. The pre-implantation fitted model for the SCS-SCD group did not support the hypothesis (p = 0.54); however, the post-reperfusion SCS-SCD model did support the hypothesis (p<0.05). The pre-implantation fitted model for the SCS-ECD group similarly did not support the hypothesis (p = 0.96), while the post-reperfusion SCS-ECD model did support the hypothesis (p<0.05). Finally, the pre-implantation model for the HMP-SCD group did support the hypothesis (p<0.05), while the post-reperfusion model did not (p = 0.56).

Inter-centroid results stratified by transplant group (IGF/DGF combined)
Prior to implantation (left in Fig. 19), kidneys from the LDKT transplant group and DDKT kidneys stored by SCS (both SCD and ECD) all exhibited a similar inter-centroid distance. DDKT kidneys stored by HMP had a higher (p<0.05) pre-implantation inter-centroid distance than all other transplant groups. All groups experienced a modest 1-4% increase in inter-centroid distance between pre-implantation and post-reperfusion scans. Post-reperfusion (right in Fig. 19) inter-centroid distance in the LDKT transplant group and the DDKT-SCS subgroups was similar. Post-reperfusion inter-centroid distance in the HMP group remained higher (p<0.005) than in the LDKT group and moderately higher than in the DDKT-SCS SCD and ECD subgroups (p = 0.14 and p = 0.06 respectively).
Fig. 19. Box and whisker plots of inter-centroid measurements for pre-implantation (left) and post-reperfusion (right) scans for the LDKT group, and the DDKT subgroups: SCD kidneys stored by SCS, ECD kidneys stored by SCS, and SCD kidneys stored by HMP. Each transplant group is further divided into recovery groups which experienced either IGF (green) or DGF (red) following transplant. Mean inter-centroid distance values for each recovery group are included in the attached table with p-values (from Student's t-test) and values adjusted for FDR between transplant groups, representing significance of difference between recovery groups for each transplant group. The mean percent change (increase or decrease) to inter-centroid distance following reperfusion is included at the bottom of each table for both recovery groups in each transplant group.

Inter-centroid results stratified by recovery group (IGF vs. DGF)
Prior to implantation, inter-centroid distance was similar between the IGF and DGF recovery groups in all transplant groups. Following reperfusion, inter-centroid distances increased in all transplant groups for both IGF and DGF recovery groups. In the SCD subgroup of DDKT kidneys stored by SCS, IGF and DGF recovery groups (green and red respectively in Fig. 19) experienced a similar increase following reperfusion. In the ECD subgroup of DDKT kidneys stored by SCS, and in the SCD subgroup of DDKT kidneys stored by HMP, the IGF recovery groups experienced a smaller increase in inter-centroid distance following reperfusion than the DGF groups. In the SCD subgroup of DDKT kidneys stored by SCS, post-reperfusion inter-centroid distance measurements were similar between IGF and DGF groups. In the ECD subgroup, inter-centroid distance was moderately lower (p = 0.09) in the IGF recovery group than in the DGF group. Post-reperfusion inter-centroid distance for the HMP group was lower (p<0.05) in the IGF recovery group than in the DGF group.

Inter-centroid results by association with post-transplant creatinine decline
Following our hypothesis that lower inter-centroid distance would correlate with a faster recovery following transplant (i.e., inter-centroid distance is positively correlated with creatinine values, and lower inter-centroid distance is correlated with a steeper decline in creatinine, expressed as a negative interaction effect with time), linear mixed effect models were fitted for each DDKT transplant group. Neither the pre-implantation nor the post-reperfusion fitted model for the SCS-SCD group supported the hypothesis (p = 0.14 and p = 0.17, respectively). Likewise, neither the pre-implantation nor the post-reperfusion fitted model for the SCS-ECD group supported the hypothesis (p = 0.28 and p = 0.72, respectively). Finally, the pre-implantation model for the HMP-SCD group did not support the hypothesis (p = 0.37); however, the post-reperfusion model trended towards moderate support of the hypothesis (p = 0.07).

Inter-Lumen results stratified by transplant group (IGF/DGF combined)
Prior to implantation (left in Fig. 20), the LDKT group exhibited larger (p<0.05) inter-lumen distance than the SCD and ECD subgroups of DDKT kidneys stored by SCS. The SCD subgroup of DDKT kidneys stored by HMP exhibited an inter-lumen distance similar to the LDKT group. Following reperfusion, inter-lumen distance decreased slightly in the LDKT transplant group, the SCD subgroup of DDKT kidneys stored by SCS, and the SCD subgroup of DDKT kidneys stored by HMP. In the ECD subgroup of DDKT kidneys stored by SCS, inter-lumen distance increased slightly following reperfusion. Post-reperfusion (right in Fig. 20) inter-lumen distance was higher (p<0.05) in the LDKT transplant group than in the SCD subgroup of DDKT kidneys stored by SCS, and the SCD subgroup of DDKT kidneys stored by HMP.
Fig. 20. Box and whisker plots of inter-lumen measurements for pre-implantation (left) and post-reperfusion (right) scans for the LDKT group (green), and the DDKT subgroups: SCD kidneys stored by SCS, ECD kidneys stored by SCS, and SCD kidneys stored by HMP. Each transplant group is further divided into recovery groups which experienced either IGF (green) or DGF (red) following transplant. Mean inter-lumen distance values for each recovery group are included in the attached table with p-values (from Student's t-test) and values adjusted for FDR between transplant groups, representing significance of difference between recovery groups for each transplant group. The percent change (increase or decrease) to inter-lumen distance following reperfusion is included at the bottom of each table for both recovery groups in each transplant group.

Inter-Lumen results stratified by recovery group (IGF vs. DGF)
Prior to implantation, inter-lumen distance was similar between the IGF and DGF recovery groups in all transplant groups. Following reperfusion, inter-lumen distances in all transplant groups decreased by less in the IGF recovery groups than in DGF groups (green and red respectively in Fig. 20). Post-reperfusion inter-lumen distance in the SCD subgroup of DDKT kidneys stored by SCS was similar between IGF and DGF recovery groups. In the ECD subgroup, post-reperfusion inter-lumen distance was moderately higher (p = 0.06) in the IGF recovery group than in the DGF group. In the HMP group, post-reperfusion inter-lumen distance was higher (p<0.05) in the IGF recovery group than in the DGF group.

Inter-Lumen results by association with post-transplant creatinine decline
Following our initial hypothesis that larger inter-lumen distance would correlate with a faster recovery following transplant (i.e., inter-lumen distance is negatively correlated with creatinine values, and higher inter-lumen distance is correlated with a steeper decline in creatinine, expressed as a negative interaction effect with time), linear mixed effect models were fitted for each DDKT transplant group. The pre-implantation fitted model for the SCS-SCD group did not support the hypothesis (p = 0.24); however, the post-reperfusion SCS-SCD model showed strong support of the hypothesis (p<0.001). The pre-implantation model for the SCS-ECD group did not support the hypothesis (p = 0.78); however, the post-reperfusion model did support the hypothesis (p<0.05). Finally, both the pre-implantation and post-reperfusion models for the HMP-SCD group showed strong support for the hypothesis (p<0.0005 and p<0.005, respectively).

Parsimony of image measurements
To assess relevance and redundancy of measurements, the compiled measurements from each transplant group were included in the pool of candidate predictor variables in lasso penalized regression models, with the post-transplant function (IGF coded as 1 vs. DGF coded as 0) as the binary outcome variable. Two sets of penalized logistic regression models were run for each transplant group: one included pre-implantation measurements only in the candidate pool to identify the most relevant of pre-implantation measurements to post-transplant function (i.e., measurements which could affect allocation or discard), and the other included all the pre-implantation and post-reperfusion measurements in the pool to determine the most relevant measurements to post-transplant function (i.e., measurements which could affect post-operative care). The number of selected measurements was determined by minimizing the averaged 3-fold cross-validation error. Selected measurements and their impact are listed in Fig. 21. In the ECD subgroup of DDKT kidneys stored by SCS, the penalized model indicated pre-implantation diameter was most relevant, among pre-implantation measurements, to post-transplant function. Pre-implantation diameter had a negative impact on post-transplant function in this instance, suggesting that larger lumen diameter is the most predictive of assessed measurements for development of DGF in this transplant subgroup. When including both pre-implantation and post-reperfusion measurements, the regression model indicated post-reperfusion diameter and post-reperfusion density as the two variables, among all measurements, that were most relevant to post-transplant function. Both have negative impact on the outcome, suggesting that larger post-reperfusion lumen diameter and higher post-reperfusion lumen density are the most predictive of assessed measurements for development of DGF in this transplant subgroup.
In the SCD subgroup of DDKT kidneys stored by SCS, the penalized model indicated pre-implantation inter-centroid distance was most relevant, among pre-implantation measurements, to post-transplant function. Pre-implantation inter-centroid distance had a negative impact on post-transplant function in this instance, suggesting that larger inter-centroid distance is the most predictive of assessed measurements for development of DGF in this transplant subgroup. When including both pre-implantation and post-reperfusion measurements, the regression model indicated pre-implantation inter-centroid distance and post-reperfusion density as the two variables, among all measurements, that were most relevant to post-transplant function. Inter-centroid distance and density had negative and positive impacts on outcome, respectively, suggesting that larger pre-implantation inter-centroid distance and lower post-reperfusion lumen density are the most predictive of assessed measurements for development of DGF in this transplant subgroup.
In the SCD subgroup of DDKT kidneys stored by HMP, the penalized model indicated pre-implantation diameter was most relevant, among pre-implantation measurements, to post-transplant function. Pre-implantation diameter had a negative impact on post-transplant function in this instance, suggesting that larger diameter is the most predictive of assessed measurements for development of DGF in this transplant subgroup. When including both pre-implantation and post-reperfusion measurements, the regression model indicated post-reperfusion inter-lumen distance and post-reperfusion density as the two variables, among all measurements, that were most relevant to post-transplant function. Both have positive impact on the outcome (IGF coded as 1), suggesting that smaller post-reperfusion inter-lumen distance and lower post-reperfusion lumen density are the most predictive of assessed measurements for development of DGF in this transplant subgroup.
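The variable-selection procedure described above can be sketched in a few lines. This is a minimal illustration, not the implementation used in the study: it assumes a proximal gradient solver, log loss as the cross-validation error, and a small hypothetical penalty grid, and it runs on synthetic data in which only the first measurement influences the outcome. All names (`fit_lasso_logistic`, `select_lambda`, the feature names) are hypothetical.

```python
import math
import random


def sigmoid(z):
    """Numerically stable logistic function."""
    if z >= 0:
        return 1.0 / (1.0 + math.exp(-z))
    e = math.exp(z)
    return e / (1.0 + e)


def predict(x, w, b):
    """Predicted probability of the outcome coded as 1 (IGF)."""
    return sigmoid(b + sum(wj * xj for wj, xj in zip(w, x)))


def soft_threshold(v, t):
    """Proximal operator of the L1 penalty: shrink v toward zero by t."""
    return math.copysign(max(abs(v) - t, 0.0), v)


def fit_lasso_logistic(X, y, lam, step=0.5, iters=300):
    """Proximal gradient descent for L1-penalized logistic regression.
    The intercept b is left unpenalized."""
    n, p = len(X), len(X[0])
    w, b = [0.0] * p, 0.0
    for _ in range(iters):
        grad_w, grad_b = [0.0] * p, 0.0
        for xi, yi in zip(X, y):
            err = predict(xi, w, b) - yi
            grad_b += err / n
            for j in range(p):
                grad_w[j] += err * xi[j] / n
        b -= step * grad_b
        w = [soft_threshold(wj - step * gj, step * lam)
             for wj, gj in zip(w, grad_w)]
    return w, b


def log_loss(X, y, w, b):
    """Mean negative log-likelihood, used here as the held-out error."""
    total = 0.0
    for xi, yi in zip(X, y):
        pr = min(max(predict(xi, w, b), 1e-12), 1.0 - 1e-12)
        total -= yi * math.log(pr) + (1 - yi) * math.log(1.0 - pr)
    return total / len(X)


def select_lambda(X, y, lambdas, k=3):
    """Pick the penalty with the lowest mean held-out loss over k folds."""
    folds = [set(range(i, len(X), k)) for i in range(k)]
    best_lam, best_err = None, float("inf")
    for lam in lambdas:
        err = 0.0
        for held in folds:
            tr = [i for i in range(len(X)) if i not in held]
            te = sorted(held)
            w, b = fit_lasso_logistic([X[i] for i in tr], [y[i] for i in tr], lam)
            err += log_loss([X[i] for i in te], [y[i] for i in te], w, b) / k
        if err < best_err:
            best_lam, best_err = lam, err
    return best_lam


# Synthetic, already-standardized measurements (not study data).
random.seed(0)
names = ["pre_diameter", "pre_density", "pre_inter_centroid", "pre_inter_lumen"]
X = [[random.gauss(0, 1) for _ in names] for _ in range(60)]
# Outcome: 1 = IGF, 0 = DGF; larger pre_diameter makes DGF more likely.
y = [1 if -2.0 * x[0] + random.gauss(0, 1) > 0 else 0 for x in X]

lambdas = [0.2, 0.1, 0.05, 0.01]  # hypothetical penalty grid
best_lam = select_lambda(X, y, lambdas)
w, b = fit_lasso_logistic(X, y, best_lam)
selected = {n: round(c, 3) for n, c in zip(names, w) if c != 0.0}
print(best_lam, selected)
```

With IGF coded as 1, a negative coefficient means that larger values of the measurement lower the predicted probability of IGF, which corresponds to the "negative impact" wording used above. Measurements whose coefficients are shrunk exactly to zero are the ones the penalty treats as redundant.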

Discussion
Fibrosis in donor kidneys may compromise graft viability, and is routinely evaluated in pre-implantation kidney biopsies [28][29][30]. Partial epithelial-to-mesenchymal transition (EMT) may play a role in the progression of fibrosis. This process has the effect of flattening PCT epithelial cells, and may produce an increased lumen diameter in affected tubules [31,32]. Similarly, fibrosis contributes to tubular atrophy, and in turn, compensatory hypertrophy of surviving PCTs [33,34]. The lumen of hypertrophied tubules is also frequently dilated to accommodate their increased role [35]. The effects of fibrosis therefore may be visible in OCT imaging, evidenced by the dilation of tubular lumen.
Acute tubular injury (ATI) in donor kidneys may similarly compromise graft viability. ATI can induce simplification of the tubular epithelium [18]. Shedding of the PCTs' microvillus brush border and sloughing of tubular epithelial cells into the lumen may also present as a dilation of the tubular lumen in OCT scans. In addition, as blood flow is restored following reperfusion, sloughed epithelial cells may obstruct flow and increase proximal tubule pressure dramatically; heightened pressure may produce substantial dilation of the tubular lumen presented in post-reperfusion OCT scans and potentially pre-implantation OCT scans of kidneys preserved by HMP [36]. The short-term effects of ATI therefore may be visible in OCT imaging, evidenced by the dilation of visible tubular lumen.
Swelling of the PCT epithelium, induced by ischemic damage, may similarly represent the effects or symptoms of ATI [18]. Epithelial swelling occludes the luminal space, resulting in a reduced diameter and an increased inter-lumen distance. If PCT swelling reduces the tubular lumen beyond the resolution of the OCT system, diameter and inter-lumen measurements would not reflect the contribution of more swollen PCTs. Density measurements, however, would illustrate this effect. In the SCD subgroup of DDKT kidneys stored by SCS, there were no strong differences in measurements between IGF and DGF recovery groups. In the ECD subgroup (those most at risk for poor post-transplant function, and most subject to discard), measures of PCT lumen density and diameter, acquired both prior to implantation and following reperfusion, were lower in the IGF than in the DGF recovery group. The IGF recovery group similarly demonstrated a larger inter-lumen distance measurement following reperfusion than the DGF group. Taken together, these measurements suggest a flattening of the PCT epithelium and consequent dilation of tubular lumen in ECD kidneys which go on to experience DGF. This may be a symptom of pre-existing pathology (fibrosis) or ATI. It is unclear why this pattern does not present in the SCD subgroup.
Following reperfusion, density and diameter measurements in both the SCD and ECD subgroups of DDKT kidneys stored by SCS experienced increases in both IGF and DGF recovery groups. This may reflect dissipation of epithelial swelling as the kidney moves away from an ischemic state. This may also result from the effect of flow rate of filtrate on luminal diameter [37]. Increased distinction between IGF and DGF recovery group measurements following reperfusion may be due to pre-existing pathology being revealed by the dissipation of swelling (e.g., dilated lumen of hypertrophied tubules may become more evident when epithelial swelling subsides). More likely, this is the result of the reperfusion process inducing further shedding of the microvillus brush border and/or further epithelial sloughing. Similarly, sloughed tubular epithelial cells which may have fully occluded the lumen during static storage may be cleared following reperfusion, revealing further luminal dilation.
In the ECD subgroup, but not the SCD subgroup, of DDKT kidneys stored by SCS, the DGF recovery group experienced an increase in inter-centroid distance following reperfusion, while the IGF group did not. This may reflect infiltration of inflammatory cells into the interstitial space, and subsequent interstitial edema [38]. This would be consistent with the ATI theory and would suggest symptoms of ischemia/reperfusion injury (IRI) in the DGF group.
In the SCD subgroup of DDKT kidneys stored by HMP, diameter and inter-lumen measurements for DGF kidneys echo the trends apparent in the ECD subgroup of DDKT kidneys stored by SCS (i.e., increased lumen diameter and reduced inter-lumen distance). This suggests that, in HMP-preserved kidneys, ATI or pre-existing pathology may also present as dilated tubular lumen with simplified or flattened tubular epithelium. Inter-centroid measurements similarly echo trends apparent in the ECD transplant group. Following reperfusion, the DGF recovery group experienced an increase in inter-centroid distance and subsequently exhibited a higher inter-centroid measurement than the IGF recovery group. This again may suggest interstitial edema following reperfusion.
Surprisingly, HMP kidneys in the DGF recovery group experienced a dramatic reduction in density following reperfusion, while the IGF group experienced little change. The resulting IGF density was higher than the density in the DGF group. Higher diameter and lower inter-lumen distances in the post-reperfusion DGF group would normally correlate with higher density measurements. One explanation for this contradictory result is that some PCT lumens in the HMP-DGF group had become fully occluded following reperfusion, excluding these PCTs from diameter and inter-lumen measurement, but still detracting from luminal area in the density measurement.
One limitation of this study is the imaging protocol, which heavily weighted the composition of image sets towards regions of the kidney where tubule lumens were most visible and dilated. While this protocol may highlight focal points of pathology, it does not provide a global distribution of PCT features. Global imaging sampling multiple areas of the kidney may reveal a more heterogeneous pattern of swelling and dilation, with some areas exhibiting tubular lumen dilated by fibrosis or ATI, and other areas exhibiting significant swelling.
In future studies, a more systematic and global imaging strategy may yield further insights. While the selection of a single B-scan for each image set removes issues of redundancy, it also severely limits the total area being investigated. In future studies, a 3D imaging protocol would eliminate redundancy, allowing all imaging data to be evaluated and a larger volume of kidney to be assessed. Similarly, 3D imaging would enable orientation of tubular features in a 3D space and would provide more accurate measurements. While the linear regression model utilized in this study attempts to correct for this issue, training data for the model is extracted only from a single preserved kidney and may not be applicable to all kidneys.

Conclusion
There is a dire need in the transplant community for new measures of kidney viability. To support the growing need for kidneys, higher risk kidneys must be considered for transplant. To efficiently utilize this deeper end of the donor pool, surgeons must be able to confidently predict kidneys' potential function and longevity following transplant.
OCT provides a non-invasive view of the microanatomy of the superficial kidney cortex. Assessment of this anatomy has the potential to offer insights into the viability of a kidney offered for transplant. This study shows that dilation of tubular lumen and simplification of tubular epithelium of the PCTs can be assessed by OCT, and these measurements correlate with post-transplant function. These factors may represent symptoms of pre-existing pathology or acute tubular injury.
OCT analysis may provide a valuable supplement to current methods for assessing kidney viability. Accurate prediction of post-transplant function prior to implantation may aid in allocation of kidneys, while accurate prediction of post-transplant function following transplant may influence post-operative care.
The variability between manual raters in this study demonstrates the necessity of consistency and reproducibility in analysis. The fully automated analysis used in this study removes the elements of user bias and subjective segmentation. Similarly, manual segmentation is far too slow a process for advising a surgeon on the time-sensitive decision to accept or reject a kidney for transplant. Fully automated segmentation and analysis provides a high-speed solution to obtaining accurate predictive measures.
This study assessed the potential utility of OCT imaging in predicting post-transplant function. While results are promising, inclusion of additional variables (KDPI, ischemic times, biopsy scoring, etc.) into one prediction model may provide a more comprehensive view of kidney viability. Similarly, global OCT imaging and capture of 3D volumes would provide a more detailed view of the distribution of PCT morphology, and may aid in prediction of post-transplant function. 3D volumes would similarly enable adoption of previously developed OCT segmentation strategies, for example the Hessian filter approach by Yousefi et al. and single-scattering model with segment-joining algorithm by Gong et al. [39,40].