A Metabolism-Based Interpretable Machine Learning Prediction Model for Diabetic Retinopathy Risk: A Cross-Sectional Study in Chinese Patients with Type 2 Diabetes

The burden of diabetic retinopathy (DR) is increasing, and the sensitive biomarkers of the disease were not enough. Studies have found that the metabolic profile, such as amino acid (AA) and acylcarnitine (AcylCN), in the early stages of DR patients might have changed, indicating the potential of metabolites to become new biomarkers. We are amid to construct a metabolite-based prediction model for DR risk. This study was conducted on type 2 diabetes (T2D) patients with or without DR. Logistic regression and extreme gradient boosting (XGBoost) prediction models were constructed using the traditional clinical features and the screening features, respectively. Assessing the predictive power of the models in terms of both discrimination and calibration, the optimal model was interpreted using the Shapley Additive exPlanations (SHAP) to quantify the effect of features on prediction. Finally, the XGBoost model incorporating AA and AcylCN variables had the best comprehensive evaluation (ROCAUC = 0.82, PRAUC = 0.44, Brier score = 0.09). C18 : 1OH lower than 0.04 μmol/L, C18 : 1 lower than 0.70 μmol/L, threonine higher than 27.0 μmol/L, and tyrosine lower than 36.0 μmol/L were associated with an increased risk of developing DR. Phenylalanine higher than 52.0 μmol/L was associated with a decreased risk of developing DR. In conclusion, our study mainly used AAs and AcylCNs to construct an interpretable XGBoost model to predict the risk of developing DR in T2D patients which is beneficial in identifying high-risk groups and preventing or delaying the onset of DR. In addition, our study proposed possible risk cut-off values for DR of C18 : 1OH, C18 : 1, threonine, tyrosine, and phenylalanine.


Introduction
Diabetic retinopathy (DR) is a common and specific microvascular complication of diabetes and remains the leading cause of blindness in working-aged people [1]. In a metaanalysis of 41 studies done in Chinese people between 1990 and 2017, researchers estimated that the pooled prevalence rates were 18.45% for any DR in people with diabetes which revealed that DR has been a heavy public health problem in China [2]. DR is generally asymptomatic in the early course and irreversible in the late stages. However, the ther-apy of DR, such as antivascular endothelial growth factor (anti-VEGF) therapy, is usually only effective in the late stages, and not all patients respond optimally [3]. There is a pressing need for new screening and treatments to prevent DR and DR-associated blindness.
Metabolites are the biological products of genomic and proteomic perturbations and also can be influenced by environment such as diet and toxins [4]. Previous metabolomic studies revealed that branched-chain amino acid (BCAA) and fatty acid (FA) might be useful to monitor the development of insulin resistance [5]. For proliferative diabetic retinopathy (PDR) in diabetic patients, the AA and acylcarnitine (AcylCN) profiles were different from non-PDR patients [4,6,7]. In our previous studies, tyrosine, phenylalanine, and several long-chain AcylCN were also found to be correlated with DR [8,9]. Thus, AA and AcylCN may have potential as new biomarkers for predicting DR.
Machine learning (ML) algorithms have great generalizability and discrimination in high-dimensional data to analyze complex real-world data [10]. Some studies of predicting diabetes and its complication risk based on ML algorithms have performed great predictive results [11,12]. However, evaluation of those studies was not enough. Moreover, due to the black box problem of ML models, we cannot know exactly how the features in the model affect their judgments on disease classification.
In this study, we investigated the ability of ML models based on blood metabolites, AA, and AcylCN, to predict the DR risk in patients with T2D in northern China. Meanwhile, the best prediction model was explained by Shapley Additive exPlanation (SHAP) to quantify the effects of metabolites.

Research Design and Study
Patients. Details of the study population and methods were described previously [13]. A total of 1898 T2D inpatients in Liaoning Medical University First Affiliated Hospital (LMUFAH) from May 2015 to August 2016 were enrolled. Diagnosis criteria for T2D were the 1999 WHO's criteria [14] or treated with antidiabetic agent. For these patients, we retrieved their electronic medical records and measured their AA and AcylCN profiles. Excluding missing age, AA, and AcylCN information, a total of 1032 patients aged 18 years or older were included in the current analysis.

Data Collection and Definitions.
Electronic medical records provided the information on demographic, anthropometric, current status of smoking and alcohol drinking, duration of T2D, clinical and laboratory measurements, medicant used, and DR status. The clinical parameters included systolic blood pressure (SBP), diastolic blood pressure (DBP), triglyceride (TG), high-density lipoprotein (HDL) cholesterol, low-density lipoprotein (LDL) cholesterol, total cholesterol (TC), glycosylated hemoglobin (HbA1c), and serum creatinine (SCr). Medication information included antidiabetic agents, antihypertensive drugs, and lipid-lowering drugs.
2.4. Laboratory Assays. Details of metabolomic approach for AA and AcylCN were published previously [18]. Briefly, dry blood spot (DBS) samples of patients were collected after 8 hours of fasting. The sample preparation process involved punching wells from DBS paper, adding working and quality control (QC) solutions, derivatizing and drying, and finally, dissolving the dried sample in fresh mobile phase solution. An AB Sciex 4000 QTrap system (AB Sciex, Framingham, MA, USA) was used to carry out the mass spectrometry metabolomic analysis. The ion source was electrospray ionization source. Analyst v1.6.0 software (AB Sciex) was used for system control and data collection. Isotope-labeled internal standards of AA (NSK-A) and AcylCN (NSK-B) from Cambridge Isotope Laboratories (Tewksbury, MA, USA) were used for preparing working solutions. AAs and carnitine QC standards were provided by Chromsystems (Grafelfing, Germany). Acetonitrile (high-performance liquid chromatography grade) was obtained from Thermo Fisher (Waltham, MA, USA).

Statistical
Analysis. Continuous variables were reported as means with standard deviation (SD) or medians with interquartile ranges (IQRs), and categorical variables were reported as frequencies (%). Nonpaired Student's t-test, Mann-Whitney U test, and Chi-square tests (or Fisher's test, if appropriate) were conducted to estimate the differences of continuous and categorical data between patients with DR and not groups, respectively.
2.5.1. Data Preparation. Data preparation was done as following. Categorical variables were dummy encoded (one binary variable for each category), and numerical variables with >30% of missing values were replaced with missing value indicators, which specify whether a value was missing (1) or not (0). Other features containing missing values were processed using multiple interpolations. Multiple imputations are implemented through the mouse package in R, with the number of imputations being 5.
2.5.2. Feature Selection. Feature selection was performed by using least absolute shrinkage and selection operator (LASSO) regression. LASSO regression contains a regularization/penalty term in its cost function to prevent overfitting and ensure that the LASSO model neglects correlated features. According to the one standard error rule (1SE rule), the optimal value corresponds to the simplest model, and the cross-validation error of which is no more than one standard error above the minimum [19]. Therefore, we refer to the shrinkage parameter determined using the 1SE rule as the optimal value to obtain least features in this study. Journal of Diabetes Research study. The dataset was split into a training set and testing set at a ratio of 7 : 3 randomly. The best trained model was obtained in the training set using grid search. Moreover, we used the area under the curve (AUC), respectively, of the receiver operating characteristic (ROC) curve and precision recall (PR) curve analysis to evaluate its discrimination. The calibration curve and the Brier score were used to evaluate the calibration. The closer the calibration curve is to the reference line and the smaller the Brier score, the better the predictive calibration of the model. These evaluations were performed in the testing set.

Model
Interpretation. The effects of features on prediction scores were measured by SHAP, which assessed the importance of each feature using a game-theoretic approach based on the testing set [20]. The SHAP value quantifies the marginal contribution of each feature to the final prediction, which in our case is a T2D patient developing DR or not. In other words, SHAP value is the contribution of a given feature i in the prediction model and is the difference between the prediction using the value of i and the mean prediction [21]. P values of <0.05 were considered statistically significant in all these analyses. Data preparation and feature selection

Result
3.1. Characteristics of the Study Patients. The 1032 patients had a mean age of 57.2 (SD 13.8) years and a median duration of T2D of 5 (IQR 0-10) years. Of these patients, 53.2% were male. The mean BMI of the cohort was 25.3 (SD 3.9) kg/m 2 . Of these patients, 162 were with prior DR, while 870 were not. Subjects with DR composed more women and had longer duration of diabetes and higher SBP and TC. Age, BMI, DBP, current smoking and drinking, HbA1c, creatinine, and drug use were similar in the two groups (Table 1).
We observed 11 AAs and 19 AcylCNs demonstrating lower levels (P < 0:05) in T2D patients with DR. Other AAs and AcylCNs were similar in the two groups (see Supplementary  table 1 & table 2).
The evaluation of calibration in testing set have been shown in Figure 3 that XGBoost model 2 demonstrated the best agreement between prediction and observation. The Brier score of XGBoost model 2 was 0.09.

Model Interpretation.
A SHAP summary plot of XGBoost model 2 showed the importance of features (Figure 4). When the Shapley value of each feature exceeds zero, it indicates an increased risk of DR. The scatter colors in the graph reflect the magnitude of the eigenvalues (larger in red, smaller in blue). As shown in Figure 4, the duration of T2D, C18 : 1OH, phenylalanine, C18 : 1, threonine, TC, and tyrosine contributed more to the model. Furthermore, the high value of duration of T2D, threonine, and TC and low value of C18 : 1OH, phenylalanine, C18 : 1, and tyrosine corresponded to a Shapley value greater than zero. This suggested that these features were important risk factors to DR.
When most features are normal and for new-onset diabetes patients, the risk of developing DR is low ( Figure 5(a)). When the duration of T2D is longer and most features (C18 : 1OH, phenylalanine, C18 : 1, glutamate, and SBP) are abnormal, the risk of DR increases ( Figure 5(b)).
To further explain the impact of each risk factor on the DR forecast, we have created the SHAP dependence plots ( Figure 6). These features all showed clearer cut-off values. Duration of T2D more than 10 years, C18 : 1OH lower than 0.04 μmol/L, C18 : 1 lower than 0.70 μmol/L, threonine higher than 27.0 μmol/L, TC higher than 4.75 mmol/L, and tyrosine lower than 36.0 μmol/L is associated with increased risk of developing DR. When patients' blood level of phenylalanine is higher than 52.0 μmol/L, it might be associated with a decreased risk of developing DR.

Discussion
In this study, a machine learning DR risk model with good predictive ability was constructed, and the effect of the predictive features on DR was analyzed. The results showed that   [23,24]. Therefore, it is essential to explore new biomarkers and develop disease prediction models. Dagliati et al. used electronic medical record data to construct ML prediction models for diabetic complications [25]. Their study demonstrated that ML algorithms were powerful tools for clinical model development and medical data mining. However, electronic medical record data contains features from a wide range of sources, making it difficult to collect and prone to missing data. In the present study, the features incorporated into the model were simplified, relying primarily on AAs and AcylCNs for prediction of disease. Not only it is easy to implement, but it identifies the early stages of disease. The pathogenesis of DR is complex, involving a variety of endothelial dysfunction, inflammation, oxidative stress, and neural mechanisms [22,26]. In addition, DR patients also have the characteristics of disorders of glucose and lipid metabolism of patients with T2D. AAs and AcylCNs that have been incorporated into the ML model are likely to be involved in these pathogenic mechanisms.

Acylcarnitine in Type 2 Diabetes and Diabetic Retinopathy.
Fatty acid oxidation (FAO) defect and metabolic derangements, such as insulin resistance, could be screened by the mass spectrometric analysis of carnitine profile [27]. There are some studies shown that levels and abundance of several long-chain AcylCNs were significantly decreased in T2D and DR patients. Reasonable speculation for results was an inhibited carnitine palmitoyltransferase-1-(CPT1-) mediated entry of free fatty acid (FA) into mitochondria and impaired mitochondrial β-oxidation of retinal FA in the retina [28][29][30]. From this perspective, reduced levels of long-chain AcylCN might be considered as a biomarker for metabolic abnormalities or a risk factor for disease, which was consistent with our results. However, there were also inconsistent studies showing a higher level of AcylCN in T2D, which might due to incomplete oxidation of long-chain FA and altered tricarboxylic acid cycle (TCA) activity in patients with T2D [31].
Dyslipidemia and lip toxicity (accumulation of lipid metabolites) are increasingly recognized as important drivers of insulin resistance states [32]. Therefore, another focus of AcylCN in T2D is whether AcylCN via the impairments of FA oxidation reflect or inflict insulin resistance [27]. In the mitochondrial lipid metabolism, AcylCN not only prevents the accumulation of noxious acyl coenzyme A (CoA) but also reduces CoA trapping. It means that AcylCN formation allows continuation of CoA-dependent metabolic, such as the carnitine shuttle and FAO, processes [33]. In animal experiments, incomplete muscle FA β-oxidation leads to AcylCN accumulation and associated oxidative stress, which may be responsible for the development of muscle insulin resistance [34].
Due to the insufficiency of reports about AcylCN in the DR, current inferences were mainly based on T2D. For the unclear pathogenesis of the disease, whether these theories can be applied in the pathogenesis of DR requires further study. In addition, the wide variation in study populations (sex, ethnicity, genetics, and sample size) might have led to inconsistent results. 9 Journal of Diabetes Research secondary precursors of dopamine (DA) which rates of synthesis were influenced by local substrate concentrations and dysregulation were contributed to retinal neurodegeneration [35][36][37]. In our study, tyrosine lower than 36.0 μmol/L and phenylalanine lower than 52.0 μmol/L increased the risk of DR, which might be related to the mechanism mentioned above. Impaired L-threonine catabolism has been shown to promote methylglyoxal (MGO) accumulation [38]. MGO is a highly reactive aldehyde which associates to impairment of regulatory mechanisms of retinal blood flow and hyperpermeability of the bloodretinal barrier in DR [39]. This was consistent with the results of our study.
Besides AAs and AcylCNs, four factors-duration of T2D, TC, SBP and age-were included in the XGBoost model in this study. Of these, the duration of T2D had the greatest impact on the model classification and increased the risk of DR when the T2D lasts more than 10 years. Other studies and literature reviews also considered the duration of diabetes as an important risk factor for DR [33].
The inconsistency with some studies was shown in our model interpretation, and part of the reasons has been stated earlier. However, what cannot be ignored is that the SHAP summary plots describe the behavior of the (imperfect) predictive model and not necessarily the causal relationships between the variables [40].

Strengths and Limitations.
Because of the late onset of DR symptoms, the difficulty in detecting them, and the high demand for physicians and equipment for funduscopic examinations, diagnostic prescriptions are rarely issued when patients are asymptomatic. Our prediction model could remind doctors and patients to pay attention to the primary and secondary prevention of DR and increase the fundus screening rate of the high-risk groups. Secondly, our study again suggested that lipid and AA metabolism may have an important role in the development of DR. The combination of AA and AcylCN has the potential to be a new predictive marker of disease. Finally, with the SHAP interpretation, the features incorporated in our ML model are relatively homogenous in origin and easily accessible. This provides the basis for the development of predictive models for clinical application.
Our study also has the following limitations. First, this study was based on a cross-sectional study of inpatients and cannot infer the causal relationship between accumulation of AA and AcylCN and DR. Second, HbA1c was not included in this study due to its excessive deficiency. Third, this study did not evaluate the effects of specific drugs, such as metformin and sodium-glucose cotransporter 2 (SGLT2). But we considered the overall influence of antidiabetic agents, antihypertensive drugs, and lipidlowering drugs. Almost T2D patients have received treatment which might result in these features not being involved in the model. Moreover, this is a single-center study without independent external validation. In the future, we will try to develop models in larger scale data and explore the associations between metabolites with DR in prospective study.

Conclusions
In conclusion, our study mainly used AAs and AcylCNs to construct an interpretable XGBoost model to predict the risk of developing DR in patients with T2D. The aim is to identify high-risk groups and then to improve fundus screening rates in high-risk groups and reduce the burden of disease. In addition, our study proposed possible risk cut-off values for DR for C18 : 1OH, C18 : 1, threonine, tyrosine, and phenylalanine. The study is beneficial in preventing or delaying the onset of DR. In the future, larger prospective studies will be needed to validate this result.

Data Availability
Data are available upon reasonable request. Our study collected clinical information from the electronic medical systems retrospectively. Our data are not in a repository. If someone is interested in our data, please contact us via email (fangzhongze@tmu.edu.cn). One should include his detailed statistical analysis plans in his email.