Prediction of Cardiovascular Parameters With Supervised Machine Learning From Singapore “I” Vessel Assessment and OCT-Angiography: A Pilot Study

Purpose: Assessment of cardiovascular risk is the keystone of prevention in cardiovascular disease. The objective of this pilot study was to estimate cardiovascular risk scores (American Heart Association [AHA] risk score, Syntax score, and SCORE risk score) with a machine learning (ML) model based on quantitative retinal vascular parameters.
Methods: We proposed a supervised ML algorithm to predict cardiovascular parameters in patients with cardiovascular diseases treated in Dijon University Hospital, using quantitative retinal vascular characteristics measured with fundus photography and optical coherence tomography angiography (OCT-A) scans (alone and combined). To describe the retinal microvascular network, we used the Singapore “I” Vessel Assessment (SIVA), which extracts vessel parameters from fundus photography, and quantitative OCT-A metrics of the superficial retinal capillary plexus.
Results: The retinal and cardiovascular data of 144 patients were included. The cardiovascular risk scores were predicted at a high rate. With the Naïve Bayes algorithm and SIVA + OCT-A data, the AHA risk score was predicted with 81.25% accuracy, the SCORE risk with 75.64% accuracy, and the Syntax score with 96.53% accuracy.
Conclusions: In this preliminary study, ML algorithms applied to quantitative retinal vascular parameters obtained with SIVA software and OCT-A predicted cardiovascular scores at a robust rate. Quantitative retinal vascular biomarkers combined with an ML strategy might provide valuable data to implement a predictive model for cardiovascular parameters.
Translational Relevance: A small data set of quantitative retinal vascular parameters from fundus photography and OCT-A can be used with ML to predict cardiovascular parameters.


Introduction
Retinal vascular imaging is constantly improving. The retinal microvascular network can be thoroughly described with fundus photograph analysis software, such as the Singapore "I" Vessel Assessment (SIVA). 1,2 In addition, the quantitative description of the retinal microvasculature was recently enhanced by the development of a new noninvasive technique: optical coherence tomography angiography (OCT-A). 3 It has been shown that quantitative retinal vascular characteristics, both from fundus photographs analyzed with SIVA software and from retinal vascular density metrics measured with OCT-A, are associated with patients' systemic vascular alteration, 4 cardiovascular risk profile, 5,6 and cardiovascular complications. 7 In addition, deep learning algorithms have been described for diabetic retinopathy detection, retinopathy of prematurity screening, grading of age-related macular degeneration, and glaucoma detection. 8-11 Furthermore, Poplin et al. recently indicated that deep learning analysis of fundus photographs could be used to predict a wide range of cardiovascular risk factors. 12 However, deep learning is limited in some healthcare applications, particularly in the context of sparse data and real-world clinical data. 13 Thus, classic machine learning (ML) methods could be trained more easily and provide better overall performance than deep learning on a small data set. We hypothesized that classic ML algorithms could predict cardiovascular risk scores and systemic parameters from quantitative retinal vascular data obtained with SIVA software and OCT-A. Better cardiovascular risk stratification is of growing interest given that cardiovascular disease (CVD) remains one of the leading causes of death worldwide. 14
The purpose of this pilot study was to develop a supervised ML prediction model for cardiovascular parameters using retinal vascular characteristics measured with SIVA software and OCT-A (alone and combined). Our objective was to estimate cardiovascular risk scores (American Heart Association [AHA] risk score, Syntax score, and SCORE risk) with this ML model.

Study Design and Patients
This pilot study was an ancillary study of a previous prospective cross-sectional pilot study (the EYE-MI study) conducted in Dijon University Hospital's Cardiology Intensive Care Unit. The methodology of the EYE-MI study and the patients' baseline characteristics have been detailed elsewhere. 5 Briefly, from May 2016 to May 2017, patients presenting with acute coronary syndrome (ACS) were included. They were taken to the ophthalmology department within the first 2 days of their hospitalization for an examination of the retinal microvasculature using OCT-A and fundus photographs. The exclusion criteria were: retinal disease (vascular occlusion, diabetic retinopathy, and macular degeneration), age under 18 years, guardianship, hemodynamic instability, and absence of either retinal examination (SIVA or OCT-A). We also excluded severely myopic eyes (axial length greater than 26 mm) because severe myopia could affect retinal microvascular density. 15 The study was approved by the Dijon University Hospital ethics committee and was registered as 2017-A02095-48. It complied with the tenets of the Declaration of Helsinki, and written informed consent was obtained from the patients. We followed the Strengthening the Reporting of Observational Studies in Epidemiology (STROBE) statement according to the EQUATOR Guidelines. 16

Retinal Microvascular Image Acquisition and Quantitative Analysis
After inclusion, the patients underwent an OCT-A examination (CIRRUS HD-OCT, Model 5000; Carl Zeiss Meditec AG), and 45-degree color retinal photographs, centered on the optic disc, were obtained with a fundus camera (TRC NW6S, Topcon, Tokyo, Japan) for both eyes. This eye examination was performed under mydriasis obtained with eye drops containing tropicamide 0.5% (Thea, Clermont-Ferrand, France). Axial length was measured using an optical biometer (IOL Master; Carl Zeiss Meditec AG, Jena, Germany). Quantitative OCT-A metrics were obtained with the angiography software (Angioplex, version 10; Carl Zeiss Meditec AG; Fig. 1A). We investigated the retinal vascular features in the superficial capillary plexus (SCP), and the following measurements were taken in the 3 × 3 mm angiograms: the area of the foveal avascular zone (FAZ; mm²), perfusion density (area, unitless), and vessel density (length, mm⁻¹). Both densities were measured in the inner and full Early Treatment Diabetic Retinopathy Study (ETDRS) circles.
In addition, fundus photographs were anonymously sent to the reading center in Yamagata University, Japan (author R.K.), and a single trained grader extracted retinal vessel characteristics with the SIVA software (Fig. 1B). The computerized analysis of the retinal vascular network was based on the analysis of vessels from the center of the optic disc and then through three successive zones corresponding to 0.5 (zone A), 1 (zone B), and 2 (zone C) disc diameters. The six largest arterioles and veins were analyzed.
Only one eye (with both OCT-A and SIVA examinations) was retained for analysis for each participant, and its selection followed the criteria described below: (1) fundus photograph and OCT-A of the right eye for participants born in even-numbered years and of the left eye for those born in odd-numbered years; (2) in single-eye patients, the functional eye was selected; and (3) when SIVA or OCT-A was uninterpretable in one eye, the other eye was retained for analysis. Only OCT-A images with a signal strength >7/10 were retained. For this ancillary study, we kept 144 eyes with both OCT-A and SIVA examinations.

Cardiovascular Data Collection
As described previously, we extracted cardiovascular data from medical records and observation sheets used by the Observatoire des Infarctus de Côte d'Or (RICO). 5 The following data were collected: age, sex, high blood pressure, diabetes mellitus history, body mass index (BMI), hypercholesterolemia, and current smoking. From these data, we calculated the cardiovascular risk score defined by the AHA (AHA risk score) for a high-risk population. The AHA risk score includes age, sex, ethnic origin, history of arterial hypertension and diabetes, active smoking, systolic and diastolic arterial pressure, and levels of total cholesterol and high-density lipoprotein (HDL) cholesterol. 17 The anatomic Syntax score, a risk stratification score for coronary lesions (length, bifurcation, diffuse disease, calcifications, thrombus, and total occlusion), was determined for all of the patients who underwent coronary angiography. 18 Finally, we calculated the SCORE risk. 19

Pre-Processing of OCT-A and SIVA Data
The values of the different parameters vary considerably from one patient to another and from one parameter to another. The scale of values also differs between parameters. To provide a common reference for all data, the values were normalized between 0 and 1. The SIVA and OCT-A settings used in this study are listed in Figures 1 and 2. Figure 2A and Figure 3A illustrate this variability and dispersion of values. Figure 2B and Figure 3B show the normalized data.
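The rescaling step can be illustrated with a minimal sketch. The text states only that the data were normalized between 0 and 1; the min-max formula x' = (x - min) / (max - min), applied per parameter across patients, is an assumption here, as are the function name, the parameter names, and the example values (which are not study data).

```python
def min_max_normalize(columns):
    """Rescale each parameter (one list of values per parameter) to [0, 1].

    Assumption: per-parameter min-max scaling across patients; the paper
    only states that values were normalized between 0 and 1.
    """
    normalized = {}
    for name, values in columns.items():
        lo, hi = min(values), max(values)
        span = hi - lo
        # A constant parameter has zero span; map it to 0.0 to avoid dividing by zero.
        normalized[name] = [0.0 if span == 0 else (v - lo) / span for v in values]
    return normalized


# Hypothetical values for three patients, for illustration only.
raw = {
    "faz_area_mm2": [0.20, 0.30, 0.40],
    "vessel_density_mm_inv": [18.0, 20.0, 22.0],
}
scaled = min_max_normalize(raw)
```

After this step, parameters measured on very different scales (for example an area in mm² and a density in mm⁻¹) contribute comparably to distance-based classifiers such as K-nearest neighbors.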

Machine Learning Approach
The method used can be summarized by the following conceptual framework:
1. To build a prediction model of the cardiovascular risk scores and parameters based on a supervised machine learning approach. The process took a set of known inputs (OCT-A and/or SIVA data) with known responses (cardiovascular risk scores and parameters) and created a model that produces reasonable predictions of these cardiovascular parameters. In this study, supervised learning based on classification (K-nearest neighbors, discriminant analysis, and Naïve Bayes) and regression (decision trees) techniques was used to develop these predictive models.
2. To use the predictive model obtained in the previous step to estimate the cardiovascular parameters of patients from their OCT-A and SIVA retinal vascular features.
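The two steps of this framework (fit a model on labeled patients, then predict the label of a patient from retinal features) can be sketched with a small Naïve Bayes classifier. This is an illustrative sketch, not the authors' Matlab code: the Gaussian likelihood, the variance floor, the function names, and the feature values are all assumptions.

```python
import math
from collections import defaultdict


def fit_gaussian_nb(X, y, var_floor=1e-6):
    """Step 1: estimate, per class, the prior and each feature's mean and variance."""
    rows_by_class = defaultdict(list)
    for row, label in zip(X, y):
        rows_by_class[label].append(row)
    model = {}
    for label, rows in rows_by_class.items():
        n = len(rows)
        columns = list(zip(*rows))
        means = [sum(col) / n for col in columns]
        # var_floor (an assumption) keeps the variance strictly positive.
        variances = [
            max(sum((v - m) ** 2 for v in col) / n, var_floor)
            for col, m in zip(columns, means)
        ]
        model[label] = (n / len(X), means, variances)
    return model


def predict_gaussian_nb(model, row):
    """Step 2: return the class with the highest log posterior for one patient."""
    best_label, best_score = None, -math.inf
    for label, (prior, means, variances) in model.items():
        score = math.log(prior)
        for v, m, var in zip(row, means, variances):
            # Features are treated as conditionally independent given the class.
            score += -0.5 * math.log(2 * math.pi * var) - (v - m) ** 2 / (2 * var)
        if score > best_score:
            best_label, best_score = label, score
    return best_label


# Hypothetical normalized retinal features (two per patient) and risk labels.
X_train = [[0.10, 0.20], [0.15, 0.25], [0.80, 0.90], [0.85, 0.75]]
y_train = ["low", "low", "high", "high"]
model = fit_gaussian_nb(X_train, y_train)
prediction = predict_gaussian_nb(model, [0.82, 0.80])
```

The same fit/predict split applies to the other classifiers named above; only the model stored between the two steps changes.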
Further details on the methodology are provided in the Supplementary Material. The cardiovascular parameters were divided into two groups. The first group consisted of the cardiovascular risk scores: AHA risk score, Syntax score, and SCORE risk. In this study, the AHA risk score was not used as a primary prevention risk assessment because it was calculated in a population with a high-risk profile. We chose this score to characterize the systemic vascular profile of our population; no therapeutic decision (aspirin, statins, and antihypertensive drugs) was made on the basis of this calculation. This group was the primary prediction goal. The second group included age, sex, high blood pressure history, diabetes mellitus history, hypercholesterolemia, current smoking status, and BMI.
In this study, each cardiovascular parameter was categorized using two or three labels (SCORE risk ≥ 5%, AHA risk ≥ 20%, Syntax score ≥ 33, age > 60 years, and BMI > 25 and > 30). To avoid bias in the choice of the learning group, patients were selected randomly. In our algorithm, the minimum number of patients required for the learning group was not defined. The learning group increased in increments of 10 patients. For each increment, the predictive model was applied to the database in order to investigate the relationship between the prediction rate and the size of the learning group. All programs were written using Matlab software. These programs can be shared on request to the authors. For each data set, the prediction rates of the four ML algorithms were compared. After that, the prediction rates of the same algorithm were compared across the three data sets (OCT-A versus SIVA versus OCT-A + SIVA). Paired tests with post hoc correction using Tukey pairwise multiple comparisons were used. 20 This analysis was done using the MIXED procedure of SAS statistical software (version 9.4; SAS Institute, Inc., Cary, NC, USA).
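The labeling and learning-group procedure described above can be sketched as follows. This is a hedged illustration, not the authors' Matlab programs: the data are synthetic, the K-nearest neighbors classifier with k = 3 is one arbitrary choice among the four algorithms, and nesting each learning group inside the next larger one is an assumption (the text states only that patients were chosen randomly and that the group grew by 10).

```python
import random


def knn_predict(train_X, train_y, row, k=3):
    """Majority vote among the k nearest learning patients (squared Euclidean distance)."""
    nearest = sorted(
        range(len(train_X)),
        key=lambda i: sum((a - b) ** 2 for a, b in zip(train_X[i], row)),
    )[:k]
    votes = [train_y[i] for i in nearest]
    return max(set(votes), key=votes.count)


def learning_curve(X, y, step=10, k=3, seed=0):
    """Prediction rate on the whole database for learning groups of growing size."""
    rng = random.Random(seed)
    order = list(range(len(X)))
    rng.shuffle(order)  # random selection of the learning patients
    curve = []
    for size in range(step, len(X) + 1, step):
        learn = order[:size]  # assumption: each group contains the previous one
        train_X = [X[i] for i in learn]
        train_y = [y[i] for i in learn]
        hits = sum(
            knn_predict(train_X, train_y, row, k) == label
            for row, label in zip(X, y)
        )
        curve.append((size, hits / len(X)))
    return curve


# Synthetic stand-in for 60 patients (not study data).
rng = random.Random(42)
aha_risk_percent = [rng.uniform(0, 40) for _ in range(60)]
# Two labels, using the AHA risk >= 20% threshold given in the text.
labels = [1 if risk >= 20 else 0 for risk in aha_risk_percent]
# One hypothetical normalized retinal feature, loosely tied to risk plus noise.
features = [
    [min(1.0, max(0.0, risk / 40 + rng.gauss(0, 0.1)))]
    for risk in aha_risk_percent
]
curve = learning_curve(features, labels)
```

Each entry of `curve` pairs a learning-group size with the prediction rate obtained when the model is applied to the full database, which includes the learning patients themselves.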

Results
In this study, the data of 144 patients were taken into account. Baseline characteristics of the study participants are presented in Table 1. The mean age was 61.9 (±12.6) years and 20.1% of the patients were female. There was no difference between participants and nonparticipants from the EYE-MI study. 5 The prediction rates of cardiovascular parameters based on the K-nearest neighbor (KNN) approach and the Naïve Bayes approach are presented in Supplementary Figures S1 and S2. These figures show the prediction rate as a function of the number of learning patients. Figures 4, 5, and 6 compare the prediction results of the four ML algorithms for the cardiovascular risk scores.
[Table 1 footnote: The results are displayed as n (%) for categorical variables and as mean and standard deviation M (±SD) for continuous variables. CHD, Cardiovascular and heart disease; LVEF, Left ventricular ejection fraction; NSTEMI, Non ST-Elevation Myocardial Infarction; STEMI, ST-Elevation Myocardial Infarction. P value for comparison between participants and nonparticipants.]
As shown in Figures 4, 5, and 6, the prediction rate depended on the number of learning patients used to build the prediction model. The prediction rates of the four ML techniques are reported in Table 2 for OCT-A data, Table 3 for SIVA data, and Table 4 for OCT-A + SIVA data. There were significant differences between the prediction rates obtained from the discriminant analysis, the KNN method, the Naïve Bayes approach, and the decision tree classification with the Type 3 Tests of Fixed Effects. After multiple test adjustment, two algorithms (Naïve Bayes and KNN) were associated with higher prediction rates compared to the discriminant analysis and the decision tree classification (Supplementary Table).
[Footnote to the prediction-rate tables: The prediction rates (%) are displayed as mean and standard deviation (M ± SD). OCT-A, optical coherence tomography angiography; SIVA, the Singapore "I" Vessel Assessment.]
For KNN, the ranges of accuracy were 0.25-0.97 with OCT-A + SIVA data, 0.31-0.98 with OCT-A data, and 0.24-0.98 with SIVA data. For the Naïve Bayes approach, the ranges of accuracy were 0.24-0.98 with OCT-A + SIVA data, 0.29-0.98 with OCT-A data, and 0.39-0.97 with SIVA data. For both strategies, sensitivity and specificity ranged from 0 to 1. The KNN and the Naïve Bayes approaches predicted the three cardiovascular risk scores more accurately than the discriminant analysis and the decision tree approaches. Finally, we compared the prediction rates obtained with SIVA, OCT-A, and OCT-A + SIVA data for these two algorithms (Naïve Bayes and KNN) and each cardiovascular parameter (Table 5).
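The accuracy, sensitivity, and specificity reported above can be computed from predicted and true labels as in the following sketch. The function name and the example labels are hypothetical; the definitions are the standard ones for a two-label parameter. The second example shows how a model that always predicts the majority label reaches sensitivity 0 and specificity 1, the extremes of the reported range.

```python
def binary_metrics(y_true, y_pred, positive=1):
    """Accuracy, sensitivity, and specificity for a two-label parameter."""
    pairs = list(zip(y_true, y_pred))
    tp = sum(t == positive and p == positive for t, p in pairs)
    tn = sum(t != positive and p != positive for t, p in pairs)
    fp = sum(t != positive and p == positive for t, p in pairs)
    fn = sum(t == positive and p != positive for t, p in pairs)

    def ratio(num, den):
        # Undefined when a class is absent; report NaN rather than raise.
        return num / den if den else float("nan")

    return {
        "accuracy": ratio(tp + tn, len(pairs)),
        "sensitivity": ratio(tp, tp + fn),  # true positive rate
        "specificity": ratio(tn, tn + fp),  # true negative rate
    }


# Hypothetical labels for eight patients (1 = high risk), not study data.
y_true = [1, 1, 1, 0, 0, 0, 0, 0]
y_pred = [1, 1, 0, 0, 0, 0, 0, 1]
metrics = binary_metrics(y_true, y_pred)

# A model that always predicts the majority label (0).
always_low = binary_metrics(y_true, [0] * len(y_true))
```

In the second case the accuracy (5 of 8 correct) still looks moderate, which is why accuracy alone can be misleading when the labels are imbalanced.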

Discussion
In this cross-sectional pilot study, we applied a prediction model of cardiovascular parameters from ML algorithms using retinal vascular characteristics measured with SIVA software and OCT-A (alone and combined). Overall, we observed that this approach can achieve a moderate to robust prediction rate for a specific cardiovascular data set.
To the best of our knowledge, this is the first study to explore the potential interest of using quantitative parameters of the retinal microvascular network measured by means of SIVA software and OCT-A to predict the cardiovascular risk factor burden of patients with a high cardiovascular risk profile using a non-deep learning model. In this study, we focused on patients with a history of cardiovascular disease (inclusion criterion: ACS). OCT-A and retinal vascular analysis based on fundus photographs with software such as SIVA provide a large amount of quantitative data. Supervised classifiers help to take proper advantage of all these quantitative data. Although deep learning based on convolutional neural networks is commonly used in ophthalmology to detect glaucoma, age-related macular degeneration, and diabetic retinopathy, 21-24 the sparsity of OCT-A and SIVA data added to our relatively small sample size makes it difficult to apply this type of deep learning algorithm in this study. 25,26 In this regard, our approach to cardiovascular prediction with retinal vascular biomarkers is novel and helpful.
Publications concerning the prediction of the cardiovascular risk profile with ML algorithms are steadily increasing in number. 27,28 Alaa et al. recently investigated cardiovascular disease risk prediction with ML methods. 29 They conducted a large prospective cohort study and analyzed data on 423,604 participants without cardiovascular disease at baseline in UK Biobank. They used more than 470 variables, notably on health and medical history, dietary and nutritional information, and sociodemographics. They found that their ML algorithm significantly improved the accuracy of cardiovascular disease risk prediction compared to gold standard scoring systems based on conventional risk factors (Framingham score). 30 The UK Biobank provides a tremendous amount of information in a very large population, making it possible to predict cardiovascular parameters 12 and, more recently, hemoglobin concentration. 31 Moreover, Gerrits et al. recently published a deep learning algorithm to predict cardiometabolic risk factors based on 12,000 retinal images from 3000 participants. 32 Recently, Cheung et al. suggested creating a "retinal vessel CVD risk score" based on an artificial intelligence system. 33 This deep learning project will have to overcome differences in terms of imaging device, image quality, and protocol standardization. Our approach is more pragmatic and could be reserved for smaller centers with a limited data set. Furthermore, our algorithms were able to include different kinds of retinal image parameters: SIVA and OCT-A combined, and not only retinal vessel caliber.
In this study, prediction rates were moderate to high for all the cardiovascular risk factors, particularly for diabetic status, the Syntax score, the AHA risk score, and the SCORE risk score. Even though patients with diabetes did not present with diabetic retinopathy, the prediction rates were high with the four algorithms for SIVA alone, OCT-A alone, and SIVA and OCT-A combined. These findings suggest that vascular changes are significant before the onset of diabetic retinopathy. 34-36 The present study is also valuable because of the good prediction rate not only for cardiovascular risk factors but also for well-established cardiovascular scores with Naïve Bayes and SIVA + OCT-A data: 81.25% for the AHA risk score, 75.64% for the SCORE risk, and 96.53% for the Syntax score.
We compared the four algorithms with multiple test adjustments. Overall, Naïve Bayes and KNN gave a higher prediction rate than the discriminant analysis and the decision tree classification for every cardiovascular parameter. Comparison of SIVA alone, OCT-A alone, and both combined did not demonstrate clear superiority of one technique over the others. Very different retinal vascular characteristics can be extracted from OCT-A and SIVA. OCT-A yields foveal and capillary network features, whereas SIVA analysis of fundus photographs describes the venules and arterioles from the center of the optic disc through three successive peripheral zones. We aimed to combine both in order to give a thorough description of the retinal vascular network as a whole. Interestingly, SIVA and OCT-A combined did not give a better prediction rate than SIVA or OCT-A alone. We could hypothesize that the predictive vascular information is already high with a single device and that extra retinal quantitative data are not incremental. Multimodal imaging could be useful in future studies to predict continuous cardiovascular parameters rather than categorical labels (two or three per parameter in this study).
At the moment, machine and deep learning algorithms are more focused on fundus photographs. 10,37 We showed that applying an ML Bayesian classifier to OCT-A quantitative data on the retinal microvasculature (alone or combined with SIVA data) could also be of interest in predictive models.
We acknowledge several limitations to this study. First, one should remain very cautious regarding these findings given the small size of the input data set (144 patients, 5 retinal vascular parameters on OCT-A and 15 parameters with SIVA software). Second, the robust prediction rates could be related to the definition of two or three labels for each cardiovascular parameter. Third, we only considered one eye per patient. This selection was based on image quality, which creates a selection bias within each patient. Furthermore, if retinal quantitative parameters had been available for both eyes, we could have used one eye from the participant to train the model and the other eye for validation. In future studies, we should consider each eye individually. Fourth, very high image acquisition quality is mandatory for this type of study. The predictive model performance could be impaired by the technical limitations of each imaging device (artifacts, segmentation abnormalities, and signal strength). Jammal et al. proposed a deep learning algorithm to detect segmentation errors on OCT scans for retinal nerve fiber layer measurement. 38 In future studies, this kind of approach could help us to improve the quality of the image data set. Fifth, this was a cross-sectional study, and future cardiovascular events were unknown. As a consequence, we were not able to evaluate the incremental value of the ML predictive model based on retinal parameters compared to the usual risk scores. Sixth, regarding OCT-A parameters, we only collected foveal vasculature structure (foveal avascular zone, vessel density inner and full, and perfusion density inner and full) by means of the Angioplex software (version 10; Carl Zeiss Meditec AG). Additional OCT-A retinal biomarkers, such as fractal dimension, could improve our prediction rates in the future. Seventh, we used the same data set for training the algorithms and for testing their prediction accuracy.
In future studies, we could use an external data set to improve our accuracy. Finally, our algorithm was applied to a population with a high cardiovascular risk profile. At present, the results cannot be extrapolated to a healthy population for primary prevention. Furthermore, a comparison group of healthy participants could help to improve the algorithms' performance in future longitudinal studies.
In conclusion, these preliminary findings demonstrate that ML algorithms applied to quantitative retinal vascular parameters with SIVA software and OCT-A show a good predictive performance of cardiovascular risk factors and cardiovascular risk scores. Quantitative retinal vascular biomarkers combined with an artificial intelligence strategy might be valuable data to implement a predictive model for cardiovascular parameters.