The Application of Artificial Neural Networks and Logistic Regression in the Evaluation of Risk for Dry Eye after Vitrectomy

Supervised machine-learning (ML) models were employed to predict the occurrence of dry eye disease (DED) after vitrectomy in this study. The clinical data of 217 patients receiving vitrectomy from April 2017 to July 2018 were used as the training dataset; the clinical data of 33 patients receiving vitrectomy from August 2018 to September 2018 were collected as the validation dataset. The input features for ML training were selected based on the Delphi method and univariate logistic regression (LR). LR and artificial neural network (ANN) models were trained and subsequently used to predict the occurrence of DED in patients who underwent vitrectomy for the first time during the period. The area under the receiver operating characteristic curve (AUC-ROC) was used to evaluate the predictive accuracy of the ML models. The AUCs of the LR and ANN models were 0.741 and 0.786, respectively, suggesting satisfactory performance in predicting the occurrence of DED. When the two models were compared in terms of predictive power, the fitting effect of the ANN model was slightly superior to that of the LR model. In conclusion, both LR and ANN models may be used to accurately predict the occurrence of DED after vitrectomy.


Introduction
Vitrectomy, an ocular surgery performed to partially or completely remove the vitreous, is widely used to treat various ocular conditions, such as cloudy vitreous, vitreous haemorrhage, retinal detachment, and proliferative diabetic retinopathy [1][2][3]. Most vitrectomies are performed to facilitate surgery to address one of a variety of retinal conditions [3]. Some vitrectomies are conducted for diagnostic purposes. With advances in the instrumentation available, vitrectomy has become a well-established procedure. Serious vitrectomy-associated complications are very rare [1,3]; however, like other types of ocular surgeries, vitrectomy may traumatize the conjunctival tissues, often resulting in the development of secondary dry eye disease (DED) [4][5][6]. Using the demographic and clinical features of patients to predict risk for vitrectomy-related DED will facilitate decision-making in the management of vitrectomy patients and improve the relationship between doctors and patients. To the best of our knowledge, no previous study has investigated the prediction of risk for secondary DED in patients scheduled to undergo vitrectomy.
In recent years, machine learning has been widely applied to solve real-life problems, including issues related to healthcare [7][8][9]. In supervised machine learning, the algorithm learns a target function from labelled training data. The outcome value of a case in unlabelled new data can be calculated based on the target function. The two learning tasks required for supervised learning are classification and regression. Although several techniques have been developed for supervised learning, those used most widely in healthcare and medicine are logistic regression (LR) and artificial neural networks (ANNs) [9,10].
LR is a machine-learning technique borrowed from statistics. The logistic regression model, based on a logistic function, is used to express the relationship between multiple input features (independent variables) and a categorical dependent variable (outcome variable) and to predict the probability of a given outcome variable [8].
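For reference, the general form of the logistic function described above can be written as follows. This is the textbook form for a binary outcome; the symbols $\beta_0, \dots, \beta_k$ (intercept and regression coefficients) and $X_1, \dots, X_k$ (input features) are generic notation introduced here for illustration and are not the fitted values of this study.

```latex
P(Y = 1 \mid X_1, \dots, X_k)
  = \frac{1}{1 + e^{-(\beta_0 + \beta_1 X_1 + \dots + \beta_k X_k)}}
```

Under this form, a positive coefficient $\beta_i$ increases the predicted probability as $X_i$ increases, which is why positive coefficients are read as risk factors.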
ANN is a nonlinear adaptive dynamic system that simulates biological nerve structure and consists of many processing units. It has become an important tool for predictive data applications [8]. In this study, we used logistic regression and an ANN to construct models for predicting the risk of secondary dry eye after vitrectomy. We evaluated the performance of these clinical prediction models in assessing the risk of secondary dry eye after vitrectomy in order to elucidate the mechanism of secondary dry eye after vitrectomy.

Patients.
This study was approved by the Ethics Committee of Tongji Medical College of Huazhong University of Science and Technology. All procedures performed in studies involving human participants were in accordance with the ethical standards of the institutional and national research committee and with the 1964 Helsinki Declaration and its later amendments or comparable ethical standards. Written informed consent was obtained from all individual participants included in the study.
We retrospectively reviewed the data of patients who underwent vitrectomy in the Ophthalmology Department of our hospital during the period from January 1, 2014, to July 31, 2018; the data from these patients were used to train the supervised ML models. We also prospectively studied the patients who underwent vitrectomy during the period from January 1, 2018, to September 1, 2018. The datasets from these patients were used to validate the ML models. The inclusion criteria for enrolment for training and validation of the ML models were as follows: (1) complete clinical data and clear outcomes; (2) age ≥18 years and ability to articulate one's feelings; (3) initial presentation at our institution; (4) history of complete vitrectomy; and (5) targeted diagnosis and treatment for the initial medical concern. The exclusion criteria included the following: (1) a voluntary request from the patient to terminate treatment during the perioperative period, followed by early discharge; (2) previous diagnosis with xerophthalmia (with or without treatment); (3) previous history of complications such as acute conjunctivitis, glaucoma, keratitis, ocular trauma, dacryocystitis, and systemic lupus erythematosus; (4) history of contact lens wear; (5) history of laser or other eye operations; (6) history of disease requiring long-term use of atropine, neostigmine, artificial tears, or other drugs that affect tear film stability; and (7) refusal to cooperate with the necessary examinations.

Feature Selection for ML.
Four representative ophthalmologists screened randomized patients with the Delphi method. Each patient was screened twice. The potentially relevant factors that were treated as categorical variables were gender, age, history of hypertension, history of diabetes mellitus, history of smoking, indoor work, occupation, daily exposure to computer or mobile phone screens, and driving conditions (driving time per day).
For the preoperative Schirmer I test (SIT), 5 mm filter paper was placed at a point 1/3 of the length of the lower conjunctival sac with respect to the medial canthus in the absence of ocular surface anesthesia. After the patient had gently closed his/her eyes for 5 minutes, the filter paper was taken out and the length of the wet filter paper was measured from the fold. To quantify preoperative BUT, sodium fluorescein solution was dripped into the conjunctival sac. After the patient had blinked several times, he/she was asked to look straight ahead. The patient was evaluated under wide-angle cobalt blue light from the slit lamp. The time from the last blink to appearance of the first black spot on the cornea was considered as tear film rupture time. After repeated measurements, the average value was obtained. For preoperative corneal fluorescein staining, fluorescein solution was dripped into the conjunctival sac of the eye that had been operated upon, which was then observed under the cobalt blue light from the slit lamp. The presence of any corneal epithelial defect was recorded. The range of corneal staining was scored as follows: corneal epithelial nonstaining, 0; dispersed fluorescence throughout the cornea, 1; slightly dense corneal staining, 2; and dense or flaky corneal staining, 2. Intraocular pressure (IP) was measured with a noncontact tonometer, with the range of normal values considered to be 10-21 mmHg. Corneal central thickness (CCT) of the operated eye was determined with an Orbscan II anterior segment analyzer.
Use of the Delphi method for analysis dictated exclusion of the following factors: history of hyperlipidemia, drinking history, educational background, correct reading posture, body mass index (BMI), duration of disease, proximity of contaminated buildings, long-term exposure to air-conditioning, and daily use of a mobile phone postoperatively.

LR Model.
Univariate analysis was performed to determine the regression coefficient for each potential influencing factor. The variables revealed to have statistical significance after univariate analysis were input to train the multivariate logistic regression model to establish the prediction equation. Stepwise regression analysis was used to eliminate variables for modeling and to observe whether there were statistical differences between variables in the goodness of fit. The Wald χ2 test was performed to estimate the logistic regression equation and regression coefficient. The partial regression coefficient (B), standard error (S.E.), Wald statistics, and p value were obtained for the corresponding variables, and the multivariate logistic regression equation was constructed.

ANN Model.
The ANN model was used to analyze the relationship between secondary DED, diagnosed 3 months after surgery, and potential risk factors for vitrectomy-associated DED. The neural network model was set up as a multilayer perceptron. The numbers of hidden layers and network neurons were automatically determined by network optimization. First, factors thought to increase risk for dry eye secondary to vitrectomy were extracted as input layer vectors. Second, we established a neural network model, which consisted of three parts: input and output layers on either side and hidden layers in the middle. The hidden part could comprise multiple layers. Finally, the network was trained by forward and backward propagation. For forward propagation, independent variables were input into the neural network from the input layer and then passed through several hidden layers. Finally, the prediction results were output to the output layer. For back propagation, an error backpropagation algorithm and the gradient descent optimization method were used to adjust the weights of each network layer. Error information was obtained by comparing the output information and expected information. The chain rule of differentiation was employed to obtain error information for each step. The error was propagated backward layer by layer to obtain the corresponding error information, and the weight and bias of each layer were adjusted accordingly.
ANN training and validation were carried out using MATLAB 2012 software. The network type was feedforward backpropagation; the training function was trainlm; the learning function was learngdm; and the error performance function was mse. The tansig function was used as the transfer function of each layer, and the maximum number of training epochs was set at 1000.
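The forward and backward passes described above can be sketched in a few lines of plain Python. This is only a minimal illustration under stated assumptions, not the MATLAB configuration used in the study: it uses one hidden layer of tanh units (the counterpart of tansig), a mean squared error loss, and batch gradient descent with momentum (loosely analogous to learngdm), whereas the study trained with trainlm (Levenberg-Marquardt). The function name `train_mlp`, the hidden layer size, the learning rate, and the momentum value are hypothetical choices made for the example.

```python
import math
import random


def train_mlp(samples, targets, hidden_units=3, epochs=1000,
              learning_rate=0.02, momentum=0.9, seed=0):
    """Train a one-hidden-layer perceptron with tanh units and an MSE loss.

    Returns (predict, history), where history holds the mean squared
    error measured at the start of each epoch.
    """
    rng = random.Random(seed)
    n_inputs = len(samples[0])
    # Each hidden unit has n_inputs weights plus a trailing bias term.
    w_hidden = [[rng.uniform(-0.5, 0.5) for _ in range(n_inputs + 1)]
                for _ in range(hidden_units)]
    # The output unit has one weight per hidden unit plus a bias.
    w_out = [rng.uniform(-0.5, 0.5) for _ in range(hidden_units + 1)]
    v_hidden = [[0.0] * (n_inputs + 1) for _ in range(hidden_units)]
    v_out = [0.0] * (hidden_units + 1)

    def forward(x):
        # Forward propagation: input layer -> hidden layer -> output layer.
        h = [math.tanh(sum(w * xi for w, xi in zip(ws[:-1], x)) + ws[-1])
             for ws in w_hidden]
        y = math.tanh(sum(w * hi for w, hi in zip(w_out[:-1], h)) + w_out[-1])
        return h, y

    history = []
    n = len(samples)
    for _ in range(epochs):
        g_hidden = [[0.0] * (n_inputs + 1) for _ in range(hidden_units)]
        g_out = [0.0] * (hidden_units + 1)
        loss = 0.0
        for x, t in zip(samples, targets):
            h, y = forward(x)
            err = y - t
            loss += err * err
            # Chain rule: gradient of the squared error at the output unit.
            delta_out = 2.0 * err * (1.0 - y * y)
            for j in range(hidden_units):
                g_out[j] += delta_out * h[j]
                # Propagate the error back through hidden tanh unit j.
                delta_h = delta_out * w_out[j] * (1.0 - h[j] * h[j])
                for i in range(n_inputs):
                    g_hidden[j][i] += delta_h * x[i]
                g_hidden[j][-1] += delta_h
            g_out[-1] += delta_out
        history.append(loss / n)
        # Gradient descent with momentum on the batch-averaged gradient.
        for j in range(hidden_units + 1):
            v_out[j] = momentum * v_out[j] - learning_rate * g_out[j] / n
            w_out[j] += v_out[j]
        for j in range(hidden_units):
            for i in range(n_inputs + 1):
                v_hidden[j][i] = (momentum * v_hidden[j][i]
                                  - learning_rate * g_hidden[j][i] / n)
                w_hidden[j][i] += v_hidden[j][i]

    return (lambda x: forward(x)[1]), history
```

Calling `train_mlp(samples, targets)` with a list of numeric feature vectors and 0/1 outcomes returns a prediction function and the per-epoch error history. In practice the features would be the coded risk factors, and a second-order method such as Levenberg-Marquardt typically converges in far fewer epochs than this first-order sketch.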
Finally, the predictions obtained with the methods described above were used as test variables. Actual prognosis outcomes were used as state variables; 1-specificity was used as the abscissa; and sensitivity was used as the ordinate. Then, the receiver operating characteristic curve (ROC curve) was drawn, and the area under the curve (AUC) and 95% CI were calculated. The binormal model was fitted according to the data results; the corresponding parameters were estimated with the maximum likelihood method; and the smooth ROC curve was obtained. The ROC curve was used to test the ML models in order to further clarify the impact value of each factor on the outcome.
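The construction of the ROC curve described above, with 1-specificity on the abscissa and sensitivity on the ordinate, can be illustrated with a short Python sketch. It computes the empirical (unsmoothed) curve and its area by the trapezoidal rule; it does not reproduce the binormal maximum likelihood fit or the confidence interval reported in this study. The function names `roc_points` and `auc_trapezoid` are hypothetical.

```python
from typing import List, Sequence, Tuple


def roc_points(scores: Sequence[float],
               labels: Sequence[int]) -> List[Tuple[float, float]]:
    """Return (1 - specificity, sensitivity) pairs, one per distinct threshold."""
    positives = sum(1 for y in labels if y == 1)
    negatives = len(labels) - positives
    if positives == 0 or negatives == 0:
        raise ValueError("both outcome classes are required")
    points = [(0.0, 0.0)]
    # Sweep the decision threshold from the highest score downward.
    for threshold in sorted(set(scores), reverse=True):
        tp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y == 1)
        fp = sum(1 for s, y in zip(scores, labels) if s >= threshold and y != 1)
        points.append((fp / negatives, tp / positives))
    return points


def auc_trapezoid(points: Sequence[Tuple[float, float]]) -> float:
    """Area under a piecewise-linear ROC curve by the trapezoidal rule."""
    area = 0.0
    for (x0, y0), (x1, y1) in zip(points, points[1:]):
        area += (x1 - x0) * (y0 + y1) / 2.0
    return area
```

Here `scores` would be the predicted probabilities from either model and `labels` the observed outcomes (1 for DED, 0 otherwise). An AUC of 0.5 corresponds to chance-level discrimination and 1.0 to perfect separation.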

Statistical Analysis.
Data analyses were conducted using the SPSS 21.0 software package, and differences were considered statistically significant when p < 0.05. The Kolmogorov-Smirnov test was used to assess the normality of the measurement data, and measurement data conforming to the normal distribution were expressed as mean ± SD/SEM. The independent sample t-test was used for between-group comparisons. Single-factor analysis of variance was used for multiple-group comparisons. Medians (M) and quartiles (Q25 and Q75) were used for measurement data that did not conform to the normal distribution.

LR Analysis for Predicting Risk for Vitrectomy-Associated DED.
Univariate analysis was performed with the chi-square test and the independent sample t-test. The factors correlated with secondary dry eye after surgery were gender (male), age, history of diabetes mellitus, history free of smoking, smoking more than 10 cigarettes per day, indoor work, daily exposure to computer and mobile phone preoperatively, preoperative BUT, and preoperative CCT (P < 0.05; Table 1). The dependent outcome variable, presence or absence of vitrectomy-associated DED, was binary. Variables associated with significant differences in univariate analysis were used as input features to train the logistic regression model by stepwise regression analysis. Significant differences between variables in goodness of fit were observed. The Wald χ2 test was performed to estimate the logistic regression equation and the regression coefficient. The final independent influencing factors (all were risk factors), in order from most to least important, were age, history of diabetes mellitus, smoking more than 10 cigarettes per day, daily exposure to electronic screens preoperatively, preoperative BUT, and duration of surgery (P < 0.05; Table 2). The goodness-of-fit test of the multivariate logistic regression equation showed that χ2 = 8.083, DF = 7, and P = 0.374, suggesting satisfactory goodness-of-fit. The equation was as follows:

$$P = \frac{1}{1 + e^{-(0.240 - 1.612 + 0.753X_2 + 0.623X_3 + 1.130X_4 + 1.112X_6 + 0.286X_7 + 0.889X_9)}},$$

where $X_2$ is age; $X_3$ is history of diabetes mellitus; $X_4$ is smoking more than 10 cigarettes per day; $X_6$ is daily exposure to computer or mobile phone screen; $X_7$ is preoperative BUT; and $X_9$ is duration of surgery.

Predictive Accuracy of the LR Model.
We substituted specific values for the independent factors from the validation dataset into the formulas presented above. We compared the outcomes predicted by these formulas with the actual outcomes and then evaluated the predictive accuracy of the LR model. The performance of the LR model was tested by the ROC curve and showed AUC = 0.741, 95% CI = 0.611-0.870, and P < 0.05. These findings suggest that the prediction model was effective in predicting the occurrence of postoperative DED secondary to vitrectomy (Figure 1).
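The substitution step described above can be sketched as follows. The snippet evaluates a generic logistic equation for one patient and applies a cutoff to obtain a predicted class. The function names, the feature names, the 0.5 cutoff, and all numeric values in the usage note are hypothetical and serve only to show the mechanics; they are not the coefficients, variable coding, or decision threshold of this study.

```python
import math
from typing import Dict


def predicted_probability(intercept: float,
                          coefficients: Dict[str, float],
                          features: Dict[str, float]) -> float:
    """Logistic model: P = 1 / (1 + exp(-(intercept + sum(b_i * x_i))))."""
    linear = intercept + sum(
        coefficients[name] * features[name] for name in coefficients
    )
    return 1.0 / (1.0 + math.exp(-linear))


def classify(probability: float, cutoff: float = 0.5) -> int:
    """Return 1 (outcome predicted) when the probability reaches the cutoff."""
    return 1 if probability >= cutoff else 0
```

For example, with an illustrative intercept of -1.0 and illustrative coefficients `{"age": 0.5, "diabetes": 1.0}`, a patient coded `{"age": 1, "diabetes": 1}` has a linear predictor of 0.5 and a predicted probability of about 0.62, which `classify` maps to 1 at the default cutoff. Comparing such predictions with the observed outcomes across the validation dataset is what yields the ROC curve.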

ANN Model and Its Performance in Predicting Vitrectomy-Associated DED.
The multilayer perceptron neural network was enhanced. The numbers of layers and neurons in hidden layers were determined automatically by network optimization. Potential influencing factors related to vitrectomy were used as input variables for the network model, with occurrence of vitrectomy-associated DED as the output variable. As shown above, 217 subjects and 33 subjects were included in the training and test datasets, respectively. Analysis of the artificial neural network identified the following as influencing factors (independent variables) correlated with DED secondary to vitrectomy (in order from most to least important, with normalized importance in parentheses): age (100%), daily exposure to computer or mobile phone screen preoperatively (76.93%), preoperative BUT (69.18%), preoperative CCT (65.24%), and daily smoking (>10) (62.69%).
The ANN model is summarized in Table 3. The trained ANN model was used to test the validating dataset. The classification of the ANN model is shown in Table 4 via comparison of predicted vs. actual outcomes. As shown in Figure 2, the performance of the ANN model in predicting the occurrence of DED was tested by the ROC curve, with the following parameters: AUC = 0.786, 95% CI = 0.667-0.906, and P < 0.05. The ROC curve showed that the prediction model was effective for the prediction of secondary DED after vitrectomy.

Comparison of the LR and ANN Models in terms of Predictive Accuracy.
The ROC curves for the two models tested showed that the predictive accuracy of the ANN model (AUC = 0.786) was slightly better than that of the LR model (AUC = 0.741). Table 5 shows the detailed parameters of the ROC curves for both ML models.

Discussion
This is the first study to use supervised ML models to predict the risk of DED after vitrectomy. Both LR and ANN models were trained with the labelled data retrieved from previous cases of vitrectomy. The two models had similar predictive accuracy: the AUCs of the ROC curves were 0.741 for the LR model and 0.786 for the ANN model, suggesting that the performance of the ANN model is slightly better than that of the LR model. Notably, both models identified age as the top risk factor for vitrectomy-associated DED. In addition, both LR and ANN models identified four common independent risk factors for DED after vitrectomy: age, smoking more than 10 cigarettes per day, daily exposure to computer or mobile phone screens preoperatively, and preoperative BUT. History of diabetes mellitus and surgical duration were identified as risk factors only by the LR model, while preoperative CCT was identified as a risk factor for vitrectomy-associated DED only by the ANN model. LR performs better for qualitative and semiquantitative (multiclassification) independent variables, while ANNs use either categorical or continuous variables as input. These facts may explain the differences between the results provided by these two models. This suggests that the ANN model has superior predictive adaptability for use in clinical research.
Few studies have investigated the influencing factors that affect the occurrence of DED after vitrectomy. Our results are consistent with previous reports that age and surgical duration are risk factors for vitrectomy-related DED [12]. In addition, Banaee et al. (2008) reported that scleral depression significantly increased risk for DED after vitrectomy [5]. The tear film, which comprises lipid, tear, and mucin layers, nourishes the conjunctival epithelium and cornea, supplies lubrication to facilitate opening and closing of the eyelids, and provides a high-quality optical surface for the cornea [13,14]. The pathogenesis of DED includes inflammation, apoptosis of the lacrimal gland cells and conjunctival epithelial cells, and androgen imbalance [13,14]. The results of this study helped us to identify the mechanisms underlying DED after vitrectomy. First, the corneal epithelium and conjunctiva may be damaged by vitrectomy. After the operation, numerous factors may disturb the ocular surface, including scleral sutures, conjunctival sutures, incisions, and conjunctival edema. Corneal curvature may be affected, resulting in a decrease in tear film stability [15]. Importantly, it has been reported that basic fibroblast growth factor (alone or in combination with cytochrome c peroxidase) accelerates the healing of surgically damaged corneal epithelium [16,17]. We therefore sought to investigate the benefit of reducing the occurrence of DED in patients who had undergone vitrectomy. Second, vitrectomy-associated congestion and edema of the corneal tissue may affect the adhesion of mucin, allowing for the infiltration of inflammatory factors. This process can cause lacrimal gland damage, which exacerbates any corneal damage [18]. Prolonged surgical time can thus destroy the stability of the tear film and lead to secondary dry eye after vitrectomy.
Third, corneal goblet cells are more sensitive and vulnerable to external environmental factors, such as hyperglycemia (in diabetic patients). Metabolic disorders and nutritional disorders shorten the tear film rupture time and destroy the normal corneal morphology, thereby reducing the secretion of mucin. Finally, the eye drops often prescribed for patients after vitrectomy contain preservatives, which can affect corneal epithelial integrity, damage repair functions, and reduce the regularity of the corneal surface. All of these factors decrease tear film stability [19]. The analysis of clinical research data is challenging due to the complexity involved: on one hand, data need to meet the constraints of analytical models; on the other hand, the data characteristics need to be retained as far as possible in order to simulate the clinical situation [20]. It is therefore of great clinical significance to use the limited clinical data available for patients who have undergone vitrectomy for data analysis. Such a data-based approach will improve the data model for predicting secondary DED after vitrectomy and help physicians to identify risks early enough to communicate effectively with patients and to provide pertinent clinical interventions.
Logistic/Cox regression analysis is suitable for discriminating two or more classified variables, obtaining approximate estimates of relative risk, and calculating their respective probabilities. Logistic/Cox regression analysis can be used to analyze most clinical data, but, despite its flexibility and ease of use, it is less effective for processing multiclass clinical data. With the increasingly close integration of computer science and applied mathematics with clinical medicine, more and more analytical and computational tools have been applied to clinical research. Various problems encountered by those performing clinical data analysis have been solved. As a digital model which imitates the functional structure of a biological neural network, the ANN model utilizes large-scale nonlinear parallel processing and strong adaptability. The ANN model does not restrict the distribution of data, allowing researchers to make full use of data information. The ANN model has strong fault tolerance, so it can be widely used in the fields of prediction and analysis [21]. In this study, the ANN model also had a better fit than the LR model. The identification of risk factors for vitrectomy-associated DED and the accurate prediction of secondary DED after vitrectomy by ML models will be helpful for clinical decision-making, as well as the management of patients who have undergone vitrectomy.

Conclusions
In conclusion, our study has shown that the LR and ANN models are similarly effective in predicting the occurrence of DED after vitrectomy. However, the ANN model better reflects the true relationship between input variables and the outcome variable.

Data Availability
The datasets generated and analyzed during the present study are available from the corresponding author upon reasonable request.

Disclosure
The funders had no role in the study design, data collection and analysis, decision to publish, or preparation of the manuscript.

Conflicts of Interest
The authors declare that they have no conflicts of interest.