A Novel Machine Learning Algorithm to Automatically Predict Visual Outcomes in Intravitreal Ranibizumab-Treated Patients with Diabetic Macular Edema

Purpose: Artificial neural networks (ANNs) are one type of artificial intelligence. Here, we use an ANN-based machine learning algorithm to automatically predict visual outcomes after ranibizumab treatment in diabetic macular edema. Methods: Patient data were used to optimize ANNs for regression calculation. The target was the final visual acuity at 52, 78, or 104 weeks. The baseline input variables were sex, age, diabetes type or condition, systemic diseases, eye status and treatment timetables. The data were randomly divided into three groups to build, test and demonstrate the accuracy of the algorithms. Results: At 52, 78 and 104 weeks, 512, 483 and 464 eyes were included, respectively. For the training, testing and validation groups, the respective correlation coefficients were 0.75, 0.77 and 0.70 (52 weeks); 0.79, 0.80 and 0.55 (78 weeks); and 0.83, 0.47 and 0.81 (104 weeks), while the mean standard errors of final visual acuity were 6.50, 6.11 and 6.40 (52 weeks); 5.91, 5.83 and 7.59 (78 weeks); and 5.39, 8.70 and 6.81 (104 weeks). Conclusions: Machine learning yielded good correlation coefficients for predicting prognosis after ranibizumab treatment using only baseline characteristics. These models could be useful clinical tools for predicting treatment success.


Introduction
Diabetic macular edema (DME) is a major complication of diabetic retinopathy. Its prevalence is 2-30%, and the risk is higher in patients with poor diabetes control or a longer duration of diabetes. DME is characterized by thickening of the center of the macula, the area chiefly responsible for the quality of vision. DME causes severe visual disturbance and is a major cause of the blindness associated with the increasing prevalence of diabetes around the world [1,2]. In addition, diabetes-related and ocular comorbidities are more prevalent in patients with DME than in those without [3]. DME is associated with ischemic changes in retinal blood vessels that result in the release of a large amount of vascular endothelial growth factor (VEGF) [4,5]. VEGF disrupts the blood-retinal barrier and leads to the accumulation of serum or fluid. Optical coherence tomography (OCT) is a noninvasive, noncontact imaging technique that provides qualitative and quantitative information. These advantages make OCT a versatile tool for diagnosing DME and for following disease progression or the response to treatment. Given the pathogenesis of DME, anti-VEGF agents, including aflibercept, bevacizumab and ranibizumab, are the most effective and safe choices for DME treatment. These medications bind VEGF, stop the leakage of serum or fluid, reverse the thickening of the macula and improve vision. More frequent injection of anti-VEGF agents is associated with better control of DME and improved visual acuity [6][7][8][9]. After one year of treatment, the mean vision improvement is 13.3 letters with aflibercept, 11.2 letters with ranibizumab and 9.7 letters with bevacizumab. In Taiwan, because the national insurance covers only five or eight injections over a patient's lifetime, these agents are usually used as rescue therapy and do not significantly improve visual outcomes. However, the risk of complications increases with the number of injections.
Because of the complicated pathogenesis and the multifactorial nature of the prognosis, there is still no reliable prognostic factor for predicting the final treatment outcome before any injection [10,11]. If we could establish a method to predict the final visual outcome and customize the treatment plan accordingly, the quality of care for patients with DME could be greatly improved.
Machine learning algorithms are increasingly being used for ophthalmology applications. Machine learning can use various approaches to analyze and summarize complex datasets for the discovery of new knowledge [12,13]. Linear modelling, such as multiple regression analysis, often performs poorly when the relationships between variables are nonlinear, as is often observed in clinical settings [14]. Similar to the functioning of the human brain, artificial neural networks (ANNs) are composed of different layers of "neurons" that are interconnected based on numerical weights. Each layer is trained using a regression analysis. Periodically, non-essential items are detected for removal using a validation set. In this manner, learning, thinking and testing are performed to optimize the weights to yield the best network [15]. ANNs can effectively manage a massive amount of information and use nonlinear modeling for calculation. ANNs are developed from a collection of randomly chosen hidden units and analytically defined output weights, which allows them to precisely predict unknown relationships between input and output factors; learning, validation and testing are repeated until an acceptable regression is obtained [16][17][18]. Using ANNs rather than linear statistical methods, Mehra predicted the pre-planting risk of Stagonospora nodorum blotch, the optimal growth and culture conditions for maximum biomass accumulation and the incidence of dengue fever [19]. In addition, ANNs can identify diabetic retinopathy and macular edema in retinal fundus photographs [20][21][22]. Moreover, an accuracy of up to 94.5% has been achieved with a deep convolutional neural network [23].
The National Institutes of Health-sponsored Diabetic Retinopathy Clinical Research Network (DRCR.net) provides a database of multicenter clinical research on diabetic retinopathy. Protocol I in the DRCR.net database is a study of intravitreal injection of ranibizumab or triamcinolone acetonide in combination with laser photocoagulation for DME [7,24,25]. Here, in patients with DME who were treated with an intravitreal injection of ranibizumab, we used baseline patient characteristics to develop and test a machine learning algorithm to predict long-term visual outcomes. We used ANNs to build decision-support models to predict visual acuity in patients with DME at 52, 78 and 104 weeks after ranibizumab treatment.

Patients and Methods
For algorithm development, we took advantage of publicly available patient data from the National Institutes of Health-sponsored DRCR.net, which provides a database of multicenter clinical research on diabetic retinopathy treated in the USA. We retrospectively analyzed data from patients treated under Protocol I in the DRCR.net database [7,24,25]. Protocol I is a multicenter clinical trial that evaluated the use of an intravitreal injection of ranibizumab or triamcinolone acetonide in combination with laser photocoagulation for DME.
The dataset included information from participants who were followed up for 5 years to evaluate the long-term effects of ranibizumab. Based on the one-year results of the study, in April 2010, all participants, regardless of randomization group, became eligible to receive ranibizumab treatment. The initial 127 datasets included 674 patients who were administered an intravitreal injection of ranibizumab, corticosteroids, or vehicle. Only 454 patients received ranibizumab and were followed up for more than 52 weeks. Because these data were obtained from a publicly available database (DRCR.net), our study was exempt from the requirement for informed patient consent.

Prediction Model Input and Target Output
Because of missing data, we used the patient data at 52, 78 and 104 weeks, corresponding to one, one and a half and two years. These three time points are the most commonly used for comparing the effectiveness of DME treatments and following the progression of DME. The longest follow-up duration in the dataset was two years. The input variables used to build a prediction model were sex, age, diabetes type, insulin use, glycated hemoglobin (HbA1c) level, hypertension under treatment, hypercholesterolemia under treatment, lens status, degree of diabetic severity, the baseline macular OCT values (central point, central, and inner and outer superior/nasal/inferior/temporal parts), the timetable of ranibizumab treatment and baseline visual acuity (Table 1). Visual acuity at 52, 78 and 104 weeks was used to define the target output. The target outputs were continuous values expressed as Early Treatment Diabetic Retinopathy Study (ETDRS) letters.

Machine Learning Development
To build the ANN, a complex learning procedure is used to identify data patterns. Once the algorithm is established, its correlation coefficients are verified through testing. Training group data were used to establish and train the ANN models, and validation and testing group data were used to evaluate the predictive performance of the trained models. In this manner, the results of training, validation and testing can demonstrate the generalization ability of the trained neural network. After these steps, the network structure or algorithm with the most satisfactory performance is selected. After this strict development process, the algorithm can be applied to support a medical decision. We used a multilayer perceptron (MLP) model with a back-propagation learning rule. This network model was composed of one input layer, one hidden layer and one output layer (Figure 1). The input layer was composed of the input variables and the output layer was the final visual outcome. Computation is performed in the hidden layer to establish the best connection between the input and output layers. The ANN models were developed using the ANN tool embedded in Statistica 10.0 (StatSoft, Inc., Tulsa, OK, USA). This tool also provided embedded trial-and-error procedures with a changing number of hidden layers to produce candidate models. Cross validation was applied by assigning 80% of the data to the training group, 10% to the validation group and 10% to the testing group to determine the generalization of the models. Data subsets were chosen by random sampling from the full dataset. Correlation coefficients were determined for each MLP model. The mean standard error (SEM) reported was the standard error of the final visual outcome, expressed in ETDRS letters. The sensitivity analysis links the weight of each input variable to each MLP model: the greater the number, the greater the effect of that input feature. This ANN design did not allow for statistical significance values.
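The random 80/10/10 partition, the correlation check and the sensitivity ranking described in this section can be illustrated with a minimal sketch in plain Python. It is not the Statistica procedure used in the study. The function names, the fixed seed, the rounding rule for subset sizes and the toy model are all assumptions introduced here, and the sensitivity function uses an error-ratio formulation (error with one input replaced by its mean, divided by the baseline error) that is one common definition; whether it matches the Statistica calculation is likewise an assumption.

```python
import math
import random


def split_indices(n_samples, train_frac=0.8, val_frac=0.1, seed=0):
    """Randomly partition sample indices into training, validation and testing subsets."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)  # fixed seed only to make the sketch reproducible
    n_train = int(n_samples * train_frac)
    n_val = int(n_samples * val_frac)
    return (indices[:n_train],
            indices[n_train:n_train + n_val],
            indices[n_train + n_val:])  # the remainder forms the testing subset


def pearson_r(xs, ys):
    """Pearson correlation coefficient between predicted and observed values."""
    n = len(xs)
    mean_x, mean_y = sum(xs) / n, sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


def sensitivity_ratios(predict, rows, targets):
    """For each input, return (error with that input set to its mean) / (baseline error)."""
    def sum_squared_error(data):
        return sum((predict(x) - y) ** 2 for x, y in zip(data, targets))

    baseline = sum_squared_error(rows)
    ratios = []
    for j in range(len(rows[0])):
        mean_j = sum(row[j] for row in rows) / len(rows)
        # Replace input j by its mean in every row, leaving the other inputs untouched.
        perturbed = [row[:j] + [mean_j] + row[j + 1:] for row in rows]
        ratios.append(sum_squared_error(perturbed) / baseline)
    return ratios


# 512 eyes were available at 52 weeks.
train, val, test = split_indices(512)
print(len(train), len(val), len(test))


# Hypothetical toy model: the prediction depends only on the first input.
def toy_predict(x):
    return 2.0 * x[0]


rows = [[1.0, 5.0], [2.0, 3.0], [3.0, 4.0], [4.0, 8.0]]
targets = [2.1, 3.9, 6.1, 7.9]
ratios = sensitivity_ratios(toy_predict, rows, targets)
print(ratios)
```

In the toy example the first input yields a ratio far above 1 and the second a ratio of 1, which mirrors how a larger sensitivity number is read as a stronger effect of that input feature.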

Results
The most influential baseline parameters for post-treatment visual acuity at 52, 78 and 104 weeks were baseline visual acuity, lens status and the intravitreal injection (IVI) timetable (Table 2). For the target of visual acuity at 52 weeks, the MLP model had 58 input features, 21 hidden neurons and 1 output feature (i.e., MLP 58-21-1). The correlation coefficients and SEM for the training, testing and validation groups were 0.75, 0.77 and 0.70 and 6.50, 6.11 and 6.40 ETDRS letters, respectively. The input variables most strongly related to the outcome, as ranked by sensitivity analysis for all three groups, were visual acuity at baseline (2.07), lens status (1.44) and the application of an injection at the fourth week (Figure 2A). Figure 2C shows the regression model for visual acuity at 52, 78 and 104 weeks.
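The notation MLP 58-21-1 can be read as a list of layer sizes. The sketch below, in plain Python, builds a network of that shape, counts its adjustable weights and biases, and runs one forward pass. The tanh hidden activation, the linear output unit, the weight initialisation and all function names are assumptions made for illustration; the activation functions and the internals of the Statistica tool are not reported here.

```python
import math
import random

N_INPUT, N_HIDDEN, N_OUTPUT = 58, 21, 1  # the 52-week architecture, MLP 58-21-1


def init_layer(n_in, n_out, rng):
    """Return (weights, biases); small random weights are an assumed initialisation."""
    weights = [[rng.uniform(-0.1, 0.1) for _ in range(n_in)] for _ in range(n_out)]
    biases = [0.0] * n_out
    return weights, biases


def forward(x, hidden, output):
    """One forward pass: tanh hidden layer (assumed) followed by a linear output unit."""
    w_h, b_h = hidden
    w_o, b_o = output
    h = [math.tanh(sum(w * xi for w, xi in zip(row, x)) + b)
         for row, b in zip(w_h, b_h)]
    return [sum(w * hi for w, hi in zip(row, h)) + b
            for row, b in zip(w_o, b_o)]


def count_parameters(*layers):
    """Total number of weights and biases across the given layers."""
    return sum(len(w) * len(w[0]) + len(b) for w, b in layers)


rng = random.Random(0)
hidden = init_layer(N_INPUT, N_HIDDEN, rng)
output = init_layer(N_HIDDEN, N_OUTPUT, rng)

n_params = count_parameters(hidden, output)
print(n_params)  # 58*21 + 21 + 21*1 + 1 = 1261

# A placeholder input vector of 58 zeros; real inputs would be the encoded baseline variables.
prediction = forward([0.0] * N_INPUT, hidden, output)
print(prediction)
```

Under these assumptions a 58-21-1 network has 58 × 21 + 21 + 21 × 1 + 1 = 1261 adjustable parameters, which back propagation tunes during training.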

Discussion
Machine learning algorithms have been used to predict the prognosis of cancer and post-traumatic stress disorder, to detect brain pathology and to predict survival in patients with burns [26][27][28][29]; however, our study is the first to describe a novel machine learning algorithm that predicts visual outcomes in patients with diabetic macular edema who are treated with intravitreal ranibizumab.
We used patient baseline characteristics and ANN machine learning to build an algorithm that successfully predicts long-term visual acuity outcomes, expressed as ETDRS letters, in patients with DME who are treated with intravitreal ranibizumab. The final SEMs of the visual acuity prediction at 52, 78 and 104 weeks were 6.3, 6.4 and 7.0 ETDRS letters, respectively. The high correlation coefficients and low SEMs of our results demonstrate that our automated system is a useful tool for informing treatment choice in patients with DME.
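The three summary SEM values quoted above are consistent with the unweighted mean of the training, testing and validation SEMs reported in the abstract. How the summary values were derived is not stated, so the averaging below is an assumption offered only as an arithmetic check.

```python
# SEM in ETDRS letters for the training, testing and validation groups,
# as reported in the abstract.
sem_by_week = {
    52: (6.50, 6.11, 6.40),
    78: (5.91, 5.83, 7.59),
    104: (5.39, 8.70, 6.81),
}

# Unweighted mean across the three groups (an assumed derivation, not stated in the text).
mean_sem = {week: round(sum(values) / len(values), 1)
            for week, values in sem_by_week.items()}

for week, value in mean_sem.items():
    print(f"{week} weeks: {value} ETDRS letters")
```

The rounded means are 6.3, 6.4 and 7.0 letters, matching the values given in the text.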
Several studies have investigated approaches for predicting DME treatment outcomes after IVI anti-VEGF therapy and report that an important factor is visual acuity at baseline [7,20,30,31,32,33]. Our study using machine learning also found that baseline visual acuity had a significant role in predicting outcomes at three different time points. These results support the idea that early detection and aggressive treatment can improve visual outcomes in patients with DME. We also found that lens status is an important consideration in the prediction of visual treatment outcome. This finding may be related to the IVI procedure increasing the progression of cataract [34]. The progression of cataract in phakic eyes after IVI may influence the final visual outcome.
The number of IVIs is critically important for the treatment of DME. Many studies suggest that more frequent IVIs yield better final visual acuity outcomes [6][7][8][9]. However, IVI is sometimes cost-prohibitive and not suitable for all clinical settings [35]. In the present study, we used the IVI time points as pre-treatment input instead of the total number of injections, to ensure the general applicability of our algorithm.
In contrast to previous reports, we did not find that baseline OCT parameters influenced visual outcomes. This difference is likely because we only included baseline OCT measurements in our models, while prior studies also included follow-up OCT measurements [30,32,36,37].
Our study has several limitations. Protocol I used a focal laser adjunct treatment for DME. However, we did not include this variable as input because the power setting and treatment protocol for the focal laser treatment were too variable. Other limitations were the small number of patients in our study and the limited combination of treatment protocols. As a result, our machine learning approach may not predict the best IVI protocol for every patient. In the future, we will combine the Protocol I data with data from other clinical settings to build a more elegant algorithm. In addition, we will use patient baseline variables and machine learning to predict the most efficient treatment timetable. The proportion of patients using insulin in Protocol I was higher than in the real world, so our results may not reflect the outcomes of patients using oral medications. Finally, adverse effects associated with ranibizumab administration in Protocol I were rare, occurring in no more than 10% of cases; therefore, we did not include adverse effects as a variable in our algorithm.

Conclusions
Using only patient baseline clinical characteristics, we can create ANN-based algorithms with good correlation coefficients for predicting final visual acuity at 52, 78 and 104 weeks in intravitreal ranibizumab-treated patients with DME. The SEM was 5-9 ETDRS letters, or approximately 1-2 lines of vision. Our models may be useful in the clinic as tools for setting expectations and explaining likely outcomes to patients. Further research using machine learning may improve the care and outcomes for patients with DME.