Supervised Machine Learning Based Multi-Task Artificial Intelligence Classification of Retinopathies

Alam, Minhaj; Le, David; Lim, Jennifer I.; Chan, Robison V.P.; Yao, Xincheng

doi:10.3390/jcm8060872

¹ Department of Bioengineering, University of Illinois at Chicago, Chicago, IL 60607, USA

² Department of Ophthalmology and Visual Sciences, University of Illinois at Chicago, Chicago, IL 60612, USA

* Author to whom correspondence should be addressed.

J. Clin. Med. 2019, 8(6), 872; https://doi.org/10.3390/jcm8060872

Submission received: 24 May 2019 / Revised: 11 June 2019 / Accepted: 12 June 2019 / Published: 18 June 2019

(This article belongs to the Special Issue The Future of Artificial Intelligence in Clinical Medicine)

Abstract

Artificial intelligence (AI) classification holds promise as a novel and affordable screening tool for clinical management of ocular diseases. Rural and underserved areas, which suffer from a lack of access to experienced ophthalmologists, may particularly benefit from this technology. Quantitative optical coherence tomography angiography (OCTA) imaging provides excellent capability to identify subtle vascular distortions, which are useful for classifying retinovascular diseases. However, application of AI for differentiation and classification of multiple eye diseases is not yet established. In this study, we demonstrate supervised machine learning based multi-task OCTA classification. We sought (1) to differentiate normal from diseased ocular conditions, (2) to differentiate different ocular disease conditions from each other, and (3) to stage the severity of each ocular condition. Quantitative OCTA features, including blood vessel tortuosity (BVT), blood vascular caliber (BVC), vessel perimeter index (VPI), blood vessel density (BVD), foveal avascular zone (FAZ) area (FAZ-A), and FAZ contour irregularity (FAZ-CI), were fully automatically extracted from the OCTA images. A stepwise backward elimination approach was employed to identify sensitive OCTA features and optimal-feature-combinations for the multi-task classification. For proof-of-concept demonstration, diabetic retinopathy (DR) and sickle cell retinopathy (SCR) were used to validate the supervised machine learning classifier. The presented AI classification methodology can be readily extended to other ocular diseases, holding promise to enable a mass-screening platform for clinical deployment and telemedicine.

Keywords:

ophthalmology; diabetic retinopathy; sickle cell retinopathy; quantitative analysis; computer aided diagnosis; artificial intelligence; support vector machine; optical coherence tomography angiography

1. Introduction

Machine learning based artificial intelligence (AI) technology has garnered increasing interest in medical applications over the past few years [1]. An AI-software platform is designed to mimic the perception of the human brain for information processing and making objective decisions. Recent studies have demonstrated AI applications in detecting retinal disease progression [2,3,4,5], identifying malignant or benign melanoma [6], and classifying pulmonary tuberculosis [7]. In ophthalmic research, application of AI technology has led to excellent diagnostic accuracy for several ocular conditions such as diabetic retinopathy (DR), age related macular degeneration (AMD), and sickle cell retinopathy (SCR) [2,4,8,9].

In the current clinical setting, mass screening programs for common ocular conditions such as DR or SCR are heavily dependent upon experienced physicians to examine and evaluate retinal images. This process is time consuming and expensive, making it difficult to scale up to incorporate the millions of individuals who harbor systemic diseases that are prone to affect the retina. Patients with early onset of retinopathies such as DR or SCR are initially asymptomatic yet require monitoring to ensure prompt medical interventions to prevent vision loss. However, it is not feasible to screen the 65 million people in the USA over the age of 50 years [1] to identify individuals with signs of early retinopathy (AMD, DR or other disease). An AI-based diagnostic tool with capability for multiple-disease differentiation would have tremendous potential to advance mass-level screening of eye diseases [10].

To date, most of the reported studies of AI diagnostic systems in the literature are based on color fundus imaging [11,12,13,14]. Fundus imaging is one of the most common clinical imaging modalities and has been widely used in evaluating retinal abnormalities. Supervised and unsupervised machine learning based diagnostic systems using fundus images have been developed by researchers for staging of individual retinopathies as well as for identifying multiple ocular diseases [8,15,16,17,18]. However, these demonstrated AI-based diagnostic tools generally face two major challenges. Firstly, fundus images provide limited resolution and retinal vascular information, limiting their capability to quantify subtle micro-vascular distortions near the foveal area and in different retinal layers. Thus, diagnostic systems using supervised machine learning algorithms suffer from low-performing quantitative feature analysis and concurrently low diagnostic accuracy. Secondly, systems using unsupervised or deep machine learning require a large and well documented database (ranging from 100,000 to millions) for training and optimizing convolutional neural networks. Even if an AI system is successfully trained, the intrinsic variance among different databases from multiple imaging centers makes it extremely difficult to provide robust accuracy metrics. Additionally, in the case of new retinal imaging modalities such as optical coherence tomography (OCT) angiography (OCTA), it is quite challenging to accumulate large, multi-center databases for efficient clinical deployment of AI-based diagnostic tools.

As a potential solution to overcome these challenges, we propose a supervised machine learning based approach to train and evaluate a support vector machine (SVM) classifier model with quantitative OCTA features for multi-task AI classification of retinopathies. By providing excellent capability for depth-resolved visualization of retinal vascular plexuses, quantitative OCTA holds genuine promise for AI screening of retinopathies. Although the comparatively smaller data size of OCTA presently limits deep-learning based strategies, the sensitivity of OCTA features to detect onset and progression of retinopathies makes it readily useful for supervised AI based screening. Recent studies have established several quantitative OCTA features correlated with subtle pathological and microvascular distortions in the retina. OCTA features such as blood vessel tortuosity (BVT), blood vascular caliber (BVC), vessel perimeter index (VPI), blood vessel density (BVD), foveal avascular zone (FAZ) area (FAZ-A), and FAZ contour irregularity (FAZ-CI) have also been validated for objective classification and staging of DR [5,19] and SCR [20], individually. Our recent studies demonstrated that DR and SCR show different effects on OCTA features, and thus quantitative OCTA analysis promises the potential of multiple-task classification to differentiate retinopathies and stages. In this study, we propose to test the feasibility of using these quantitative OCTA features for machine learning based multi-task AI screening of different retinopathies. For easy comparison with our recent studies, DR and SCR were selected as the two diseases for technical validation of the proposed AI screening methodology. The AI system containing an SVM classifier model utilizes a hierarchical backward elimination technique to identify the optimal-feature-combination for the best diagnostic accuracy and most efficient classification performance.
The AI-based screening tool performs multi-layer hierarchical tasks: (1) normal vs. disease classification, (2) inter-disease classification (DR vs. SCR), and (3) staging of DR (mild, moderate and severe non-proliferative DR (NPDR)) and SCR (mild and severe). The performance of the AI system has been quantitatively validated against manually labeled ground truth, using sensitivity, specificity and accuracy metrics along with a graphical metric, i.e., the receiver operating characteristic (ROC) curve.

2. Methods

Figure 1 illustrates the step-by-step methodology for the machine learning based multi-task AI classification. Each classification task involved three primary steps. The first step was OCTA image data acquisition and feature extraction (DA and FE). The second step was optimal feature identification (OFI) using a hierarchical backward elimination technique for the specific classification task. The third step was to validate multiple-task classification (MTC) using the identified optimal-feature-combinations.

2.1. Data Acquisition and Feature Extraction

2.1.1. Data Acquisition

This cross-sectional study was approved by the Institutional Review Board (IRB) of the University of Illinois at Chicago (UIC) and complied with the ethical standards stated in the Declaration of Helsinki. Both the DR and SCR patients were recruited from UIC Retinal Clinic. All patients underwent complete anterior and dilated posterior segment examination (JIL, RVPC). For DR, a retrospective study of consecutive type II diabetes patients was conducted on those who underwent OCT/OCTA imaging. The patients are representative of a university population of diabetic patients who require imaging for management of diabetic macular edema and DR. Two board-certified retina specialists classified the patients based on the severity of DR (mild, moderate, severe NPDR) according to the Early Treatment Diabetic Retinopathy Study (ETDRS) staging system. In case of SCR, disease stages were graded according to the Goldberg classification (stage I-V, from mild to severe). Only stage II (mild) and III (severe) SCR data were included in this study as stage I OCTA data were limited in number while stage IV OCTA images were unreliable due to distortions caused by hemorrhages and vessel proliferation. For simplification in the classification process, we define the stage II and III as mild and severe stage SCR, respectively. The control OCTA data were obtained from healthy volunteers (no history of retinopathy) who gave informed consent for OCT/OCTA imaging. Both eyes (OD: right and OS: left) were examined and imaged. We did not include eyes with other ocular disease or any pathological features in their retina such as epiretinal membranes and macular edema. Additional exclusion criteria included eyes with prior history of vitreoretinal surgery, intravitreal injections or significant (greater than a typical blot hemorrhage) macular hemorrhages.

Spectral domain (SD)-OCT and OCTA image data were acquired using an Angiovue SD-OCT device (Optovue, Fremont, CA, USA) with a 70,000 Hz A-scan rate and axial and lateral resolutions of ~5 μm and ~15 μm, respectively. All OCTA images used in this study were 6 mm × 6 mm scans; OCTA images were acquired from both superficial and deep capillary plexuses (SCP and DCP). All the images were quantitatively examined, and OCTA images with severe motion or shadow artifacts [21] were excluded. The OCTA image quality was quantified with the scan quality metric provided in the Angiovue software interface, ReVue. Any OCTA image with a scan quality score less than 5 was excluded. The OCTA images were exported from the imaging device, and custom-developed MATLAB procedures were used for image processing, feature extraction and classification as described below.

2.1.2. Data Pre-processing and OCTA Feature Extraction

All the OCTA images used in this study had a field of view (FOV) of 6 mm × 6 mm (304 × 304 pixels). The OCTA images were normalized to a standard window level based on the maximum and minimum intensity values to account for light and contrast image variation. Bias field correction and contrast adjustment of the OCTA images improved the overall reliability of the extracted features and concurrently the performance of the classifier model to identify OCTAs from different cohorts.
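As an illustration of the intensity normalization described above, the following is a minimal sketch that assumes a plain min-max rescaling to the range [0, 1]. The exact window level, the bias field correction and the contrast adjustment used in the study are not specified here, so the function name `normalize_window` and the target range are assumptions for illustration only.

```python
def normalize_window(image):
    """Min-max rescale a 2-D image (list of rows) to the range [0, 1].

    Assumption: a simple min-max window; the study's actual window level
    and bias field correction are not reproduced here.
    """
    flat = [value for row in image for value in row]
    lowest, highest = min(flat), max(flat)
    if highest == lowest:
        # A constant image carries no contrast; map every pixel to 0.
        return [[0.0 for _ in row] for row in image]
    span = highest - lowest
    return [[(value - lowest) / span for value in row] for row in image]
```

Rescaling every image to the same range is one simple way to make intensity-dependent steps such as thresholding behave comparably across scans.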

Six different quantitative OCTA features were extracted from each OCTA image (Figure 2) for the AI classification. The vascular features were BVT, BVC, VPI, and BVD, while the foveal features were FAZ-A and FAZ-CI. Before measuring the vascular features, the vessel map and skeleton map were extracted from the OCTA image (i.e., Figure 2(A2,A3)). For the vessel map, we used a Hessian based multi-scale Frangi filter [22] to enhance vascular flow information. This method utilized the eigenvectors of the Hessian matrices and calculated the likelihood that an OCTA region is a vascular structure. Adaptive thresholding along with morphological functions were further used for cleaning the vessel map and removing noise. From the vessel map, a skeleton map was generated using morphological shrinking functions. The extracted vessel and skeleton maps occupied an average of 47.34% and 25.81% of the OCTA image area, respectively.

A brief description of the feature measurement procedure is as follows:

BVT: The BVT was measured in the SCP. The tortuosity of each vessel branch was measured from the skeleton map, and the average BVT was calculated as [23],

BVT = \frac{1}{n} \sum_{i = 1}^{n} \left( \frac{\text{Geodesic distance of vessel branch } i}{\text{Euclidean distance of vessel branch } i} \right)

(1)

\text{Euclidean distance} = \sqrt{(x_{1} - x_{2})^{2} + (y_{1} - y_{2})^{2}}

(2)

\text{Geodesic distance} = \int_{t_{0}}^{t_{1}} \sqrt{\left( \frac{dx(t)}{dt} \right)^{2} + \left( \frac{dy(t)}{dt} \right)^{2}} \, dt

(3)

where n is the number of vessel branches, (x_1, y_1) and (x_2, y_2) are the two endpoints of a vessel branch, and (x(t), y(t)) for t in [t_0, t_1] traces the branch between them.
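Equations (1)–(3) can be sketched in discrete form as follows. This is a minimal illustration, not the study's MATLAB code: it assumes each vessel branch is already available as an ordered list of (x, y) skeleton points, approximates the geodesic integral by summing segment lengths, and uses the hypothetical function names `branch_tortuosity` and `mean_tortuosity`.

```python
import math


def branch_tortuosity(points):
    """Ratio of path length to endpoint distance for one vessel branch.

    `points` is an ordered list of (x, y) skeleton coordinates (assumed input).
    """
    # Discrete approximation of the geodesic distance, Equation (3).
    geodesic = sum(math.dist(p, q) for p, q in zip(points, points[1:]))
    # Straight-line distance between the two endpoints, Equation (2).
    euclidean = math.dist(points[0], points[-1])
    if euclidean == 0:
        raise ValueError("branch endpoints coincide; tortuosity is undefined")
    return geodesic / euclidean


def mean_tortuosity(branches):
    """Average tortuosity over all branches, Equation (1)."""
    ratios = [branch_tortuosity(branch) for branch in branches]
    return sum(ratios) / len(ratios)
```

A perfectly straight branch yields a ratio of 1, and the ratio grows as the branch deviates from the straight line between its endpoints.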

BVC: The BVC was measured from the SCP as the ratio of vascular area (calculated from vessel map) and vascular length (calculated from skeleton map) [23],

BVC = \frac{\text{Vascular area}}{\text{Vascular length}}

(4)

VPI: The VPI was measured from the perimeter map (i.e., Figure 2A4) in the SCP as the ratio of vessel perimeter area and total image area [23],

VPI = \frac{\text{Perimeter area}}{\text{Total image area}}

(5)
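Equations (4) and (5) reduce to pixel counting once the binary maps exist. The sketch below assumes the vessel, skeleton and perimeter maps are given as 2-D lists of 0/1 values and that vascular length is approximated by the number of skeleton pixels; the function names are hypothetical and the study's own MATLAB implementation may differ.

```python
def count_pixels(binary_map):
    """Number of non-zero pixels in a 2-D binary map (list of rows)."""
    return sum(1 for row in binary_map for value in row if value)


def vessel_caliber(vessel_map, skeleton_map):
    """BVC, Equation (4): vascular area divided by vascular length.

    Assumption: length is approximated by the skeleton pixel count.
    """
    length = count_pixels(skeleton_map)
    if length == 0:
        raise ValueError("empty skeleton map")
    return count_pixels(vessel_map) / length


def vessel_perimeter_index(perimeter_map):
    """VPI, Equation (5): perimeter pixels divided by all image pixels."""
    total = sum(len(row) for row in perimeter_map)
    return count_pixels(perimeter_map) / total
```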

BVD: The BVD was measured in both the SCP and DCP using the fractal dimension (FD) technique. The details and rationale of the FD calculation have been described previously [23]. Each pixel was assigned an FD value from 0 to 1, where 0 corresponds to an avascular region and 1 corresponds to large vessel pixels. FD values from 0.7 to 1 correspond to vessel pixels, and the average BVD was measured as the ratio of vascular area to total image area,

BVD = \frac{\text{Vascular area}}{\text{Total image area}}

(6)

The BVD measurements were taken in three localized regions in the retina, three circular regions of diameter 2 mm, 4 mm and 6 mm (C1, C2, and C3) around the fovea (i.e., as shown in Figure 2A5). The segmented FAZ area was excluded when measuring BVD for improved diagnostic accuracy.
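The regional BVD measurement can be sketched as below. The sketch assumes a binary vessel map (for example, pixels whose FD value is at least 0.7), a binary FAZ mask of the same size, a fovea located at the image center, and full circular discs for C1 to C3; whether the study used discs or concentric rings is not stated here, and the function name and arguments are hypothetical.

```python
def vessel_density_in_circle(vessel_map, faz_mask, diameter_mm, fov_mm=6.0):
    """BVD inside a fovea-centered circle, excluding FAZ pixels.

    Assumptions: square field of view of `fov_mm`, fovea at the image
    center, and a full disc (not a ring) of the given diameter.
    """
    n_rows, n_cols = len(vessel_map), len(vessel_map[0])
    mm_per_pixel = fov_mm / n_cols
    center_row, center_col = (n_rows - 1) / 2, (n_cols - 1) / 2
    radius_px = (diameter_mm / 2) / mm_per_pixel
    vessel_count = total_count = 0
    for r in range(n_rows):
        for c in range(n_cols):
            inside = (r - center_row) ** 2 + (c - center_col) ** 2 <= radius_px ** 2
            if inside and not faz_mask[r][c]:
                total_count += 1
                if vessel_map[r][c]:
                    vessel_count += 1
    if total_count == 0:
        raise ValueError("no pixels left in the region after excluding the FAZ")
    return vessel_count / total_count
```

Calling the function with diameters of 2, 4 and 6 mm would give the three regional values under these assumptions.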

FAZ-A: The FAZ-A was measured in both SCP and DCP. The fovea was demarcated automatically (i.e., blue area in Figure 2A3) and FAZ-A was measured as,

\text{FAZ-A} \ (\mu m^{2}) = \text{Number of pixels in the fovea} \times \text{Area of a single pixel}

(7)

The FAZ was measured using an active contour technique [23], where the seed point was automatically chosen as the center pixel of the OCTA image, since all the OCTA images were imaged as macula-centered scans. The automatically segmented FAZ area was compared to manually traced FAZ labelling and had 98.26% similarity with manually segmented ground truths.
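As a worked illustration of Equation (7), the sketch below converts a FAZ pixel count into an area using the 6 mm × 6 mm field of view and 304 × 304 pixel grid stated earlier, which gives a pixel side of roughly 19.74 μm. The function name and the pixel count of 800 are hypothetical and chosen only for illustration.

```python
def faz_area_um2(n_faz_pixels, fov_mm=6.0, image_width_px=304):
    """FAZ area in square micrometers from a pixel count, Equation (7)."""
    # Side length of one pixel in micrometers (about 19.74 for 6 mm / 304 px).
    pixel_side_um = fov_mm * 1000.0 / image_width_px
    return n_faz_pixels * pixel_side_um ** 2


# Hypothetical pixel count, not a value reported in the study.
example_area = faz_area_um2(800)
print(f"{example_area / 1e6:.3f} mm^2")
```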

FAZ-CI: The FAZ-CI was measured in both the SCP and DCP. From the demarcated fovea, FAZ contour was segmented automatically [23] (i.e., green demarcated contour in Figure 2A3). From the segmented contour the FAZ-CI was measured as,

\text{FAZ-CI} = \frac{\text{Perimeter of the FAZ contour}}{\text{Perimeter of a circle with area equivalent to the FAZ}}

(8)
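Equation (8) can be illustrated with a polygonal contour. The sketch assumes the FAZ contour is available as an ordered list of (x, y) vertices of a closed polygon, computes its area with the shoelace formula, and compares its perimeter with that of the equal-area circle, 2√(πA). The function name is hypothetical.

```python
import math


def contour_irregularity(contour):
    """FAZ-CI, Equation (8), for a closed polygon given as (x, y) vertices."""
    n = len(contour)
    edges = [(contour[i], contour[(i + 1) % n]) for i in range(n)]
    perimeter = sum(math.dist(p, q) for p, q in edges)
    # Shoelace formula for the enclosed area.
    area = abs(sum(p[0] * q[1] - q[0] * p[1] for p, q in edges)) / 2
    if area == 0:
        raise ValueError("degenerate contour with zero area")
    # Perimeter of the circle that has the same area as the contour.
    circle_perimeter = 2 * math.sqrt(math.pi * area)
    return perimeter / circle_perimeter
```

A circle gives the minimum value of 1, so a larger ratio indicates a more irregular contour; a square, for instance, gives 2/√π, about 1.13.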

2.2. Optimal Feature Identification

2.2.1. Statistics and Classification Model

Statistical analyses were conducted using MATLAB (Mathworks, Natick, MA, USA) and OriginPro (OriginLab Corporation, MA, USA). All the OCTA features were tested for normality using a Shapiro-Wilk test. For normally distributed variables, one-versus-one comparisons were conducted using Student’s t-test, and one-way, multi-label analysis of variance (ANOVA) was used to compare differences among multiple groups. If the features were not normally distributed, we used the Mann-Whitney U test for one-versus-one comparisons and the non-parametric Kruskal-Wallis test for comparing multiple groups. A chi-square test was used to compare the sex and hypertension distribution among different groups. For age distribution, we used ANOVA. Spearman’s correlation coefficients (r_s) were measured to analyze the relationship among the OCTA features and their correlation with DR or SCR severity. Statistical significance for univariate analysis and correlation tests was defined as p < 0.05; however, the p values were Bonferroni-corrected for multiple simultaneous group comparisons. For the classification model that would be trained with OCTA features and perform the diagnosis prediction, we chose a support vector machine (SVM) classifier. In the case of logistic regression based backward elimination (Figure 1B), the initial critical value of p was 0.15 for the univariate model and 0.1 for the multivariate model. In this case, a p value of 0.05 or less was too conservative, and valuable information from multivariate regression analysis of different features could have been lost.
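The Bonferroni correction mentioned above can be sketched in a few lines. This is a generic illustration rather than the study's code: each raw p value is multiplied by the number of simultaneous comparisons and capped at 1, and the function name is hypothetical.

```python
def bonferroni_adjust(p_values):
    """Bonferroni-adjusted p values for a family of simultaneous comparisons."""
    n_comparisons = len(p_values)
    return [min(1.0, p * n_comparisons) for p in p_values]
```

An adjusted value is then compared against the usual 0.05 threshold, which is equivalent to comparing each raw p value against 0.05 divided by the number of comparisons.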

2.2.2. Optimal Feature Selection with Backward Elimination

We implemented feature optimization to choose a subset of OCTA features that delivered the best diagnostic prediction for each classification task, i.e., (1) identifying disease patients from control, (2) inter-disease (DR vs. SCR) classification and (3) staging of DR (mild, moderate, and severe NPDR) and SCR (mild and severe), respectively. Taking inspiration from Occam’s Razor, we aimed to choose the smallest classification model that fit the data. For choosing this optimal feature combination for each classification task, we used a stepwise backward elimination technique. The flowchart of the steps taken in backward elimination of features is illustrated in Figure 1B. Backward elimination started with all of the predictors in the model. The least significant variable, that is, the one with the largest p value and the worst prediction performance in a regression analysis, was removed and the model was refitted. Each subsequent step removed the least significant variable in the model until all remaining variables had individual p values smaller than the critical p value (set at 0.05). After the SVM was trained with the optimal feature combination, we tested the classification model with a testing data set. This feature selection process using backward elimination was repeated for each of the steps, and the SVM model was trained with the corresponding optimal feature combination at each step for a specific classification task. For control vs. disease and DR vs. SCR classification, the SVM performed a binary (one vs. one) classification, while for staging disease conditions (mild vs. moderate vs. severe NPDR and mild vs. severe SCR) the SVM performed a multi-class classification. The prediction was performed on the testing database with 5-fold cross validation to control any overfitting. Once the SVM was trained with the optimal feature combination, any new data could be directly inputted into the classifier to generate task-specific predictions.
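The elimination loop described above can be sketched as follows. This is an outline under stated assumptions, not the study's implementation: `p_values_fn` is a hypothetical callback that refits a regression model on the currently selected features and returns one p value per feature, and the stopping threshold defaults to the 0.05 quoted in this section.

```python
def backward_eliminate(features, p_values_fn, critical_p=0.05):
    """Stepwise backward elimination driven by per-feature p values.

    `p_values_fn(selected)` is an assumed callback that refits the model
    on `selected` and returns a dict mapping each feature to its p value.
    """
    selected = list(features)
    while selected:
        p_values = p_values_fn(selected)  # refit on the current subset
        worst = max(selected, key=lambda name: p_values[name])
        if p_values[worst] < critical_p:
            break  # every remaining feature is below the critical value
        selected.remove(worst)  # drop the least significant feature
    return selected
```

In practice the callback would wrap a logistic regression fit; the loop itself does not depend on which model supplies the p values.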

2.2.3. Performance Metrics

The performance of the prediction model was evaluated with sensitivity, specificity, and accuracy metrics. Receiver operating characteristic (ROC) curves were also generated along with the area under the ROC curve (AUC). The ROC curve plots the true positive rate (i.e., sensitivity) as a function of the false positive rate (i.e., 1 − specificity) at different tradeoff points. The AUC was then measured to quantify how well the classifier was able to identify the different classes. The closer the curve was to the upper left corner, the more accurate the prediction was. An AUC of 1 or 100% represented a perfect prediction, and 0.5 or 50% represented a prediction no better than chance.
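These metrics can be computed directly from labels and classifier outputs, as in the minimal sketch below. It assumes binary labels coded as 1 (positive) and 0 (negative) with at least one example of each, and it computes the AUC through its rank interpretation: the fraction of positive-negative pairs in which the positive case receives the higher score, with ties counted as one half. The function names are hypothetical.

```python
def confusion_metrics(y_true, y_pred):
    """Return (sensitivity, specificity, accuracy) for binary 0/1 labels."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)
    tn = sum(1 for t, p in pairs if t == 0 and p == 0)
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)
    fn = sum(1 for t, p in pairs if t == 1 and p == 0)
    sensitivity = tp / (tp + fn)  # true positive rate
    specificity = tn / (tn + fp)  # true negative rate
    accuracy = (tp + tn) / len(pairs)
    return sensitivity, specificity, accuracy


def roc_auc(y_true, scores):
    """AUC as the probability that a positive outscores a negative."""
    positives = [s for t, s in zip(y_true, scores) if t == 1]
    negatives = [s for t, s in zip(y_true, scores) if t == 0]
    wins = sum(
        1.0 if p > n else 0.5 if p == n else 0.0
        for p in positives
        for n in negatives
    )
    return wins / (len(positives) * len(negatives))
```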

3. Results

The OCTA image database in this study included 115 images from 60 DR patients (20 mild, 20 moderate and 20 severe NPDR), 90 images from 48 SCR patients (30 stage II mild and 18 stage III severe SCR), and 40 images from 20 control subjects (representative images shown in Figure 2). Patient demographic data are shown in Table 1. There were no statistically significant differences in age and sex distribution among the control, DR and SCR groups (ANOVA, p = 0.14; chi-square test, p = 0.11 and p = 0.32, respectively). For DR, no significant difference in hypertension or insulin dependency among the disease stage groups was observed.

3.1. Optimal Feature Selection Using Backward Elimination

We employed a logistic regression-based model with backward elimination to select the optimal combination of features for the multi-task classification. A summary of the quantitative univariate analysis of the OCTA features is shown in Tables S1–S3 for comparing control vs. DR vs. SCR, NPDR stages and SCR stages, respectively. In general, BVT, BVC and FAZ parameters increased with disease onset and progression, whereas BVD and VPI decreased. The comparison of the diagnostic accuracy for each feature in the backward elimination process is shown in Table 2. Figure 3 provides further support to the results shown in Table 2, showing relative changes of OCTA features in different groups. The four panels correspond to the four classification tasks, respectively. The backward elimination started with all OCTA features and eliminated features one by one based on the prediction accuracy of the fitted regression model. The feature selection method identified an optimal feature combination for each classification task, i.e., perifoveal BVD_SC3 (SCP, circular area: >4 mm), FAZ-A_S (SCP) and FAZ-CI_D (DCP) for control vs. disease classification; BVT_S (SCP), BVD_SC3, FAZ-A_S, and FAZ-CI_D for DR vs. SCR classification; BVD_SC3 and FAZ-A_S for NPDR staging; and BVT_S, BVD_SC3, and FAZ-CI_S (SCP) for SCR staging. From Table 2, we can observe that the individual accuracies of the optimal features in each classification task were the highest compared to the other features, and the model fitted with the combination of these optimal features provided the best diagnostic accuracy. Also, from Figure 3, we can see that the relative changes in each cohort could only be observed in the chosen optimal OCTA features.

3.2. Multi-Task Classification

The SVM classifier performed the classification tasks in a hierarchical manner. To evaluate the diagnostic performance in each step or task, we measured the sensitivity and specificity. For each task, the ROC curves were also drawn (Figure 4) and AUCs were calculated. At the first step, the SVM identified diseased patients from control subjects with 97.84% sensitivity and 96.88% specificity (AUC = 0.98). After identifying the diseased patients, the classifier sorted them into two groups, DR and SCR, with 95.01% sensitivity and 92.25% specificity (AUC = 0.94). After sorting to the corresponding retinopathies, the SVM conducted the condition staging classification: 92.18% sensitivity and 86.43% specificity for NPDR staging (mild vs. moderate vs. severe; AUC = 0.96), and 93.19% sensitivity and 91.60% specificity for SCR staging (mild vs. severe; AUC = 0.97). The sensitivity, specificity and AUC metrics were calculated for the SVM model trained with the optimal feature combination. Table 3 shows the performance metrics in further detail.
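The hierarchical routing described above can be outlined as a short function. This is only a structural sketch: the four classifier arguments are hypothetical callables standing in for the trained task-specific SVM models, each of which would receive its own optimal feature subset in the actual system.

```python
def hierarchical_predict(features, is_diseased, is_dr, stage_dr, stage_scr):
    """Route one eye through the three-level classification hierarchy.

    All four classifier arguments are assumed callables that take the
    feature record and return a decision; they are placeholders here.
    """
    # Level 1: control vs. disease.
    if not is_diseased(features):
        return ("control", None)
    # Level 2: DR vs. SCR.
    if is_dr(features):
        # Level 3a: NPDR staging (mild, moderate or severe).
        return ("DR", stage_dr(features))
    # Level 3b: SCR staging (mild or severe).
    return ("SCR", stage_scr(features))
```

Because each level only runs when the previous level routes an eye to it, an error at an early level propagates to the later ones, which is one reason the per-level sensitivity and specificity are reported separately.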

4. Discussion

We herein demonstrate the feasibility of a supervised machine learning based AI screening tool for multiple retinopathies using quantitative OCTA technology. In a hierarchical manner, this diagnostic tool can perform multiple tasks to classify (i) control vs. disease, (ii) DR vs. SCR, and (iii) different stages of NPDR and SCR, using quantitative features extracted from OCTA images. These OCTA images can provide visualization of subtle microvascular structures in intraretinal layers, which permits a comprehensive quantitative analysis of pathological changes due to retinopathies associated with systemic disease, such as DR and SCR. Morphological distortions such as impaired capillary perfusion, vessel tortuosity and overall changes in foveal size and complexity were quantitatively measured and compared for identifying onset and progression of DR or SCR in patients with diabetes and sickle cell disease (SCD), respectively. The SVM classifier model demonstrated a robust diagnostic performance in all classification tasks. The classification model also utilized a backward-elimination strategy for choosing an optimal combination of OCTA features to obtain the best diagnostic performance with the highest efficiency. Proper implementation of this AI-based tool in primary care centers would facilitate quick and efficient screening and diagnosis of vision impairment due to systemic diseases.

For any screening and diagnostic prediction system, sensitivity is a patient safety criterion [24]. The AI-based tool’s major role is to identify patients prone to vision impairment due to retinopathies. In the control vs. disease classification task, the 94.84% sensitivity of our system represents the capability to identify individual eyes with retinopathies (DR and SCR) from a general pool of control, DR and SCR eyes. Furthermore, the system can identify patients with DR or SCR with 95.01% sensitivity. This is crucial for screening purposes, as those patients should be referred to eye care specialists. Similarly, specificity is also an important factor because it represents the capability of detecting subjects who do not require referral to an eye care specialist. When the data pool comprises millions of patients, this discriminatory capability is crucial for clinical effectiveness in mass-screening. Our system demonstrates 96.88% specificity, which means the control subjects would rarely be erroneously referred for treatment of retinopathies; additionally, 92.25% specificity in DR vs. SCR classification means the patients with DR or SCR would rarely be referred with an incorrect diagnosis. This is relevant since certain advanced stages of a disease tend to progress faster than others and hence require more expedient evaluation and management upon referral. In mass-screening applications, the AI classification tool will be useful to identify proper referral for patients with systemic diseases (i.e., diabetes or SCD) and avoid unnecessary referral for patients who do not need specialized care at that time point.

Our study demonstrated that an optimal combination of OCTA features can achieve maximum diagnostic accuracy for all classification tasks. As supported by results from Table 2 and Figure 3, we can observe that, in all performance metrics, the classification model trained with the optimal feature combination demonstrated better diagnostic proficiency compared to the model trained with individual features or the combination of all features. The OCTA features analyzed in this study represent vascular and foveal distortions in the retina due to retinopathy from both superficial and deep layers as well as localized circular regions in the retina (BVD). Out of all these OCTA features, the feature selection strategy identified the most sensitive features for each classification task to significantly distinguish different cohorts. The high diagnostic accuracy of the SVM classifier trained with the optimal feature combination highlights the importance of selecting the most relevant features in automated classification. A few features that showed significance in the univariate analysis (Supplementary Tables) were not selected in the final set of optimal features. This suggests a contrast between clinical applicability and overall difference of OCTA features among different patient groups. Ashraf et al. [19] observed a similar phenomenon when using feature selection for automated staging of DR eyes. In all the classification tasks, the most sensitive features also had low correlation amongst themselves. Figure 5 illustrates a scatter plot showing correlation analysis for DR vs. SCR classification. We can observe that only FAZ parameters had positive correlation with each other; neither BVT nor BVD was significantly correlated with FAZ parameters (Spearman’s rank test, p > 0.05), suggesting that the features captured different pathological aspects of the diseased retina.
Therefore, the four optimal features were objective for identifying DR or SCR associated distortions, and their combination yielded strong classification performance.

The optimal OCTA features selected by the AI classification tool have been previously shown in the literature to be useful in quantitative analysis studies [25,26,27,28,29,30,31,32,33,34,35]. Both BVD and FAZ parameters (FAZ-A and FAZ-CI) have been shown to be significant in identifying DR stages [19,33,35,36]. The tortuosity metric, BVT, is also an established predictor for SCR progression. In two separate studies, we previously demonstrated an SVM classifier for automated staging of DR groups (mild, moderate, severe) [5] and SCR groups (stage II and III) [20]. In our DR study, the most sensitive OCTA feature was observed to be BVD, while for SCR, it was BVT and FAZ. These sensitive OCTA features were also included in the optimal feature set by the backward elimination technique in our current study for different classification tasks. Our current study, therefore, supports our previous findings and also demonstrates the clinical importance of identifying the most sensitive features for different retinopathies. Furthermore, the optimal features included measurements from both SCP and DCP. Previous OCTA studies [19], including our recent studies [5,23], have suggested that the onset and progression of DR or SCR in diabetes or SCD patients affect both retinal layers. By choosing optimized features from SCP and DCP, the AI-based model ensured representation of layer specific distortions due to retinopathies.

For practical implementation of any AI-based tool in mass-screening at a clinical setting, a major challenge is the computation time required for overall feature extraction, optimization and diagnostic prediction. Our AI-based screening tool required only 4–6 s to extract features from each OCTA image. From the training data, the optimized features were chosen using backward elimination, which takes approximately 40–50 s (done only one time) depending on the size of the dataset. After the training of the SVM classifier is completed, it takes 8–10 s to classify the testing database used in this study. If new data are included for diagnosis prediction, it takes only 1–2 s per OCTA image to use the trained model to classify control, DR or SCR eyes. However, at this point the AI-based tool is implemented in MATLAB (Mathworks, Natick, MA, USA), a separate software package not integrated in the OCTA imaging device (Angiovue from Optovue, Fremont, CA, in our case). Once the technology is integrated into the interface of the OCTA device, users can view real-time predictions as soon as the OCTA image is captured in retina clinics. The diagnostic accuracy can be enhanced even further if the patient history or clinical information is integrated into the screening tool.

Limitations of this study include the relatively modest sample size for each cohort and the use of a single imaging center. In future studies, we plan to include multiple imaging centers and a much larger OCTA database to test the robustness of our AI screening tool for practical implementation in retina clinics. Furthermore, we relied on the segmentation provided by the clinical device to identify the images from SCP and DCP. Thus, there is a possibility of segmentation error. Potential motion and projection artifacts in OCTA and errors in reconstruction of OCTAs from SD-OCT volume data were a few other limitations. However, we attempted to minimize the effect of these errors and artifacts in our study by excluding the images with severe artifacts, segmentation errors and patients with macular edema.

5. Conclusions

In conclusion, we present a supervised machine learning based multi-task AI classification tool that uses an optimal combination of quantitative OCTA features for objective classification of control, DR and SCR eyes with high diagnostic accuracy (89.60–97.45% across the four tasks; Table 2). Using the feature selection strategy, the classifier selected BVD_SC3, FAZ-A_S and FAZ-CI_D for control vs. disease classification; BVT_S, BVD_SC3, FAZ-A_S and FAZ-CI_D for DR vs. SCR classification; BVD_SC3 and FAZ-A_S for staging of NPDR severity; and BVT_S, BVD_SC3 and FAZ-CI_S for staging of SCR severity. The optimal feature combination for each classification task corresponds directly to the most significant morphological changes in the retina and provides the most effective classification performance with the least computational complexity. Our diagnostic tool performs well on cross-validated data. However, further validation studies using larger cohorts of OCTA data from different centers and devices will facilitate future clinical implementation of a mass-level AI-based screening tool.

6. Patents

Pending patent application: X.Y., M.A., and J.I.L.

Supplementary Materials

The following are available online at https://www.mdpi.com/2077-0383/8/6/872/s1, Table S1: Univariate analysis of individual OCTA features for control, DR and SCR cohorts, Table S2: Univariate analysis of individual OCTA features for NPDR stages, Table S3: Univariate analysis of individual OCTA features for SCR stages.

Author Contributions

Conceptualization, X.Y. and J.I.L.; methodology, M.A. and X.Y.; software, M.A.; validation, X.Y. and J.I.L.; formal analysis, M.A. and D.L.; investigation, M.A.; resources, X.Y., J.I.L. and R.V.P.C.; data curation, M.A.; writing (original draft preparation), M.A.; writing (review and editing), X.Y., J.I.L. and R.V.P.C.; visualization, M.A.; supervision, X.Y.; project administration, X.Y.; funding acquisition, X.Y.

Funding

This research was supported in part by NIH grants R01 EY030101, R01 EY024628 and P30 EY001792; by an unrestricted grant from Research to Prevent Blindness; by the Richard and Loan Hill endowment; and by the Marion H. Schenk Chair endowment.

Acknowledgments

The authors thank Mark Janowicz and Andrea Degillio (Eye and Ear Infirmary, University of Illinois at Chicago) for technical support of data acquisition.

Conflicts of Interest

Pending patent application: X. Yao, M. Alam, and J. I. Lim. The other authors declare no competing interests.

Figure 1. (A) Step by step methodology of artificial intelligence (AI) based classification. (B) Optimal feature selection with hierarchical backward elimination technique. DA and FE: data acquisition and feature extraction; OFI: optimal feature identification; MTC: Multiple-task classification.

Figure 2. Representative optical coherence tomography angiography (OCTA) images illustrating the feature extraction. (A1–A5) Control subject, (B1–B5) mild non-proliferative diabetic retinopathy (NPDR) subject, (C1–C5) moderate NPDR subject, (D1–D5) severe NPDR subject, (E1–E5) mild sickle cell retinopathy (SCR) (stage II) subject, (F1–F5) severe SCR subject. Column 1: OCTA image. Column 2: Segmented blood vessel map including large blood vessels and small capillaries. The Hessian-based Frangi vesselness filter and fractal dimension (FD) classification provide a robust and accurate blood vessel map. Column 3: Skeletonized blood vessel map (red) with segmented foveal avascular zone (FAZ) (green region) and FAZ contour (yellow boundary around the FAZ). Column 4: Vessel perimeter map. Column 5: Contour maps created with normalized values of local fractal dimension. The scale bar shown in A1 corresponds to 1.5 mm and applies to all the images.

Figure 3. Normalized feature trends for different cohorts. (A) Change in disease group (DR and SCR) compared to control. (B) Change in SCR compared to DR. (C) Change in moderate and severe NPDR compared to mild NPDR. (D) Change in severe SCR compared to mild SCR. Error bars represent standard deviation.

Figure 4. ROC curves illustrating classification performance of the prediction model using the optimal combination of features. (A) Control vs. disease classification. (B) DR vs. SCR classification. (C) NPDR staging. (D) SCR staging.

Figure 5. Correlation analysis among the four most sensitive features. The scatter plots also show the distribution of control, DR and SCR patient data for different feature combinations.

Table 1. Demographics of control, DR and SCR subjects.

Characteristic	Control	Mild NPDR	Moderate NPDR	Severe NPDR	Mild SCR	Severe SCR
Number of subjects	20	20	20	20	30	18
Sex (male)	12	11	12	11	17	11
Age, years (mean ± SD)	42 ± 9.8	50.1 ± 12.61	50.8 ± 8.39	57.84 ± 10.37	51 ± 11.52	59.73 ± 8.26
Age range, years	25–71	24–74	32–68	41–73	28–71	46–75
Ethnicity	25% AA, 20% Ca, 45% A, 10% SA	60% AA, 20% Ca, 15% A, 5% SA	65% AA, 20% Ca, 15% A	60% AA, 30% Ca, 10% A	90% AA, 5% Ca, 5% A	90% AA, 10% Ca
Duration of disease	-	19.64 ± 13.27	16.13 ± 10.58	23.40 ± 11.95	13.25 ± 8.78	18.43 ± 10.7
Diabetes type	-	Type II	Type II	Type II	-	-
Insulin dependent (Y/N)	-	7/13	12/8	15/5	-	-
HbA1C (%)	-	6.5 ± 0.6	7.3 ± 0.9	7.8 ± 1.3	-	-
HTN prevalence (%)	10	45	80	80	-	-

DR: diabetic retinopathy, NPDR: non-proliferative diabetic retinopathy, SCR: sickle cell retinopathy, SD: standard deviation, HbA1C: glycated hemoglobin, HTN: hypertension, AA: African American, Ca: Caucasian, A: Asian, SA: South Asian. '-' denotes 'not applicable or not available'.

Table 2. Diagnostic accuracy measured during hierarchical backward elimination.

Feature	Control vs. Disease accuracy (%)	DR vs. SCR accuracy (%)	NPDR Staging accuracy (%)	SCR Staging accuracy (%)
BVT_S	81.75	81.64	71.26	89.15
BVC_S	79.88	75.59	78.51	71.92
VPI_S	76.49	76.83	78.39	65.46
BVD_SC1	72.11	53.14	62.02	55.19
BVD_SC2	80.02	77.98	75.83	74.98
BVD_SC3	89.01	83.49	82.67	83.67
BVD_DC1	69.35	52.17	64.30	58.02
BVD_DC2	78.53	75.83	78.54	76.20
BVD_DC3	80.69	70.28	77.13	65.59
FAZ-A_S	91.67	83.66	85.02	78.84
FAZ-A_D	88.48	80.09	80.46	76.11
FAZ-CI_S	88.74	81.57	79.34	80.95
FAZ-CI_D	89.05	82.65	78.95	75.69
Optimal feature combination	97.45	94.32	89.60	93.11

The suffixes S and D denote the superficial capillary plexus (SCP) and deep capillary plexus (DCP), respectively. For BVD, C1–C3 denote circular areas 1, 2 and 3, respectively, as shown in Figure 2.
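
The backward elimination behind Table 2 can be illustrated with a generic greedy sketch in Python. This is an illustration under stated assumptions rather than the MATLAB implementation used in the study: the stopping rule (stop when no single removal improves the score) is one common choice and may differ from the authors' criterion, and `toy_score` is a hypothetical stand-in for the cross-validated accuracy of the classifier.

```python
from typing import Callable, FrozenSet, Iterable, Tuple


def backward_elimination(
    features: Iterable[str],
    score: Callable[[FrozenSet[str]], float],
) -> Tuple[FrozenSet[str], float]:
    """Greedy backward elimination.

    Start from all features; at each step drop the single feature whose
    removal gives the highest score, and stop when no removal improves
    on the current score.
    """
    current = frozenset(features)
    best = score(current)
    while len(current) > 1:
        # Score every subset obtained by dropping exactly one feature.
        candidates = [(score(current - {f}), f) for f in sorted(current)]
        cand_score, dropped = max(candidates)
        if cand_score <= best:
            break  # no single removal helps; keep the current subset
        current, best = current - {dropped}, cand_score
    return current, best


# Hypothetical toy scorer (NOT the study's classifier): rewards two
# "informative" features and slightly penalises every extra feature.
def toy_score(subset: FrozenSet[str]) -> float:
    informative = {"BVD_SC3", "FAZ-A_S"}
    return len(subset & informative) - 0.1 * len(subset - informative)


selected, value = backward_elimination(
    ["BVT_S", "BVC_S", "BVD_SC3", "FAZ-A_S"], toy_score
)
print(sorted(selected), value)  # ['BVD_SC3', 'FAZ-A_S'] 2.0
```

In practice the `score` argument would wrap a cross-validated classifier evaluation, so each call is far more expensive than in this toy example; that cost is consistent with feature selection being a one-time training step.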

Table 3. Performance evaluation of multi-task classification algorithm using optimal feature combination.

Classification task	AUC	Sensitivity (%)	Specificity (%)
Control vs. Disease	0.98	97.84	96.88
DR vs. SCR	0.94	95.01	92.25
NPDR Staging	0.96	92.18	86.43
SCR Staging	0.97	93.19	91.60

AUC = area under the receiver operating characteristic (ROC) curve.
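
For readers less familiar with the metrics in Table 3, the standard binary definitions of sensitivity and specificity can be sketched in Python as below. The function name and the confusion-matrix counts are hypothetical and chosen only for illustration; they are not data from this study, and for the multi-stage tasks the way per-class results were aggregated is not restated here.

```python
def classification_metrics(tp, fn, tn, fp):
    """Return sensitivity, specificity and accuracy as fractions in [0, 1].

    tp/fn: diseased eyes classified correctly/incorrectly
    tn/fp: non-diseased eyes classified correctly/incorrectly
    """
    sensitivity = tp / (tp + fn)   # diseased eyes correctly flagged
    specificity = tn / (tn + fp)   # non-diseased eyes correctly cleared
    accuracy = (tp + tn) / (tp + fn + tn + fp)
    return sensitivity, specificity, accuracy


# Hypothetical counts for illustration only (not data from this study).
sens, spec, acc = classification_metrics(tp=90, fn=10, tn=19, fp=1)
print(f"{sens:.2%} {spec:.2%} {acc:.2%}")  # 90.00% 95.00% 90.83%
```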

© 2019 by the authors. Licensee MDPI, Basel, Switzerland. This article is an open access article distributed under the terms and conditions of the Creative Commons Attribution (CC BY) license (http://creativecommons.org/licenses/by/4.0/).
