Identification of key biomarkers for early warning of diabetic retinopathy using BP neural network algorithm and hierarchical clustering analysis

Diabetic retinopathy is one of the most common microangiopathies in diabetes, essentially caused by abnormal blood glucose metabolism resulting from insufficient insulin secretion or reduced insulin activity. Epidemiological surveys show that about one third of diabetes patients have signs of diabetic retinopathy, and another third may suffer from serious retinopathy that threatens vision. However, the pathogenesis of diabetic retinopathy is still unclear, and there is no systematic method to detect the onset of the disease and effectively predict its occurrence. In this study, we used medical detection data from diabetic retinopathy patients to determine the key biomarkers that induce disease onset through a back propagation (BP) neural network algorithm and hierarchical clustering analysis, ultimately obtaining early warning signals of the disease. The detected key markers can also be used to explore the induction mechanism of the disease and to deliver a strong warning signal before its occurrence. We found that multiple clinical indicators that form the key markers, such as glycated hemoglobin, serum uric acid, and alanine aminotransferase, are closely related to the occurrence of the disease. They induce the disease through individual lipid metabolism, cellular oxidation-reduction, bone metabolism and bone resorption, and blood coagulation function, respectively. The key markers that induce diabetic retinopathy do not act independently; they form a complete module that works in coordination before the onset of the disease and transmits a strong warning signal. The key markers detected by this algorithm are more sensitive and effective in the early warning of disease. Hence, a new method based on key markers is proposed for the study of diabetic microvascular lesions. In clinical prediction and diagnosis, doctors can use key markers to give early warning of individual diseases and intervene early.

Predicting and preventing diabetic retinopathy has become a global public health issue that requires enhanced research and practice.
Diabetes has become the third major chronic disease, after cardiovascular diseases and malignant tumors. Diabetes is a clinically heterogeneous glucose intolerance syndrome, mainly attributed to selective immune-mediated damage to islet beta cells, which results in a lack of insulin and glucose metabolism disorders. Chronic hyperglycemia poses serious health risks and leads to a form of retinopathy known as diabetic retinopathy [4]. Compared with research on type 1 and type 2 diabetes, studies on the pathogenesis and early warning of diabetes complications are relatively rare. Some scholars have investigated the risk factors and related genes that lead to diabetic retinopathy [5,6] and the prediction of diabetic retinopathy from retinal images [7][8][9]. Yun et al. investigated the metabolic features of diabetic retinopathy using metabolomics profiling [10]. The study found significant differences in metabolite concentrations among different groups of diabetic retinopathy; 16 metabolites were identified as common to non-proliferative diabetic retinopathy (NPDR) and proliferative diabetic retinopathy (PDR), among which three were found to be potential markers of diabetic retinopathy progression. The identification of these metabolic features will help to deepen our understanding of the mechanisms underlying diabetic retinopathy and provide a basis for the prevention and treatment of related diseases. Carmen et al. constructed the linear relationship between the graduated indicators of the diabetes risk assessment model according to the theory of physique identification [11]. They investigated the feasibility of using complexity analysis of the glucose profile to predict the development of type 2 diabetes in high-risk individuals, and found that a complexity metric calculated from 24-h glucose time series using detrended fluctuation analysis (DFA) can significantly predict the development of type 2 diabetes in these individuals. DFA remained a significant predictor even after adjusting for other clinical and biochemical variables. This method has the potential to identify patients in need of intensified treatment and to provide new insights for the prevention and management of diabetes. Shankar et al. constructed a diabetes risk assessment model to study the linear relationship between biochemical indicators, and proposed a deep learning-based automated detection and classification model for fundus diabetic retinopathy images [12]. Chakravarthy et al. discussed the diagnostic efficiency and accuracy of diabetic retinopathy detection based on an artificial intelligence system, and proposed a framework, DR-NET, that uses stacked convolutional neural networks for diabetic retinopathy detection from digital fundus images [13]. Progress in artificial intelligence for diabetic retinopathy screening, including artificial intelligence applications in 'real-world settings', is summarized in Gunasekeran's research; the use of artificial intelligence models for diabetic retinopathy risk stratification improved the efficiency of diabetic retinopathy management [14]. Somasundaram designed a machine learning bagging ensemble classifier to identify retinal features for the diagnosis and early detection of diabetic retinopathy [15]. However, previous studies on the risk warning of diabetic complications usually relied on the detection of a large number of biochemical indicators or gene sequences, which requires a more complicated detection and diagnosis process.
To reduce the cost of diabetes detection and save medical resources, an increasing number of studies use artificial intelligence models for early warning of diabetic retinopathy risk, such as convolutional neural networks [16], deep neural networks [17,18], deep learning [19], and BP neural networks [20]. In this paper, we constructed a diabetes risk warning model based on an improved BP neural network to determine the key markers affecting the onset of diabetes. The artificial neural network (ANN) was first proposed in 1943 and has been widely used in medical diagnosis, prognosis, survival analysis, clinical decision-making and other medical fields. The BP neural network is a highly nonlinear mapping network, which can reveal the nonlinear relationships between the medical diagnostic indexes of T1DM patients. In recent years, scholars have applied neural networks to the prediction of blood glucose in insulin-dependent diabetes patients and showed that the prediction performance is superior to that of other methods [21]. Su B established a prediction model by analyzing the relationship between diabetic retinopathy and related metabolic and biochemical indicators, and the experimental results showed that the model had high accuracy in diabetes risk assessment [22]. This technology can help people study and treat diseases by mining hidden and valuable information from existing medical data. Moreover, it has few applications in medical treatment, especially in diabetes complications, so its potential value is worth exploring in depth.
Currently, some studies have used convolutional neural network algorithms, data-driven methods, or embedded deep learning to construct diabetes detection models [23,24]. Dolly Dos et al. gave a detailed review of diabetic retinopathy (DR), covering its features, causes, machine learning models, deep learning models, challenges, comparisons, and future directions for early detection [25]. Huiqun Wu et al. proposed a back-propagation artificial neural network (BP-ANN) with a priori knowledge for early diabetic retinopathy detection. They compared the efficacy of this method with traditional BP networks and SVM, showing promising results in detecting DR at an early stage [26]. Sukran Yaman et al. addressed the challenges in diagnosing and screening diabetic retinopathy using deep learning models and discussed issues such as unbalanced datasets and incorrect annotations. They presented a comparison study of state-of-the-art approaches for automated DR detection, highlighting the effectiveness of hybrid modeling strategies [27]. However, there is little research on early warning of diabetic retinopathy risk. We therefore focus on the key markers of diabetic retinopathy and on early warning of diabetes risk. Based on BP neural network theory, a new diabetes early warning model was established, which can predict the onset of diabetes by identifying the index values of the key markers of the disease. The BP neural network algorithm is effective in early warning and diagnosis of diabetic retinopathy because it analyzes the complex relationships between key markers. It can detect warning signals before the disease occurs; the network model is optimized for minimal error, and the key markers for early disease warning are then identified through hierarchical cluster analysis. In clinical practice, we only need to detect the key markers and calculate the risk warning value of the key markers by using the algorithm, so as to further improve the identification of the key

Data
In this paper, the data are from PHDA, a data warehouse of the National Population Health Science Data Center of China (NPHDC), CSTR: A0006.11.A0005.202006.001018. It is an internationally recognized data warehouse certified by re3data and FAIRsharing. The dataset was accessed through a rigorous online application process, platform acceptance, and specific data service agreements with the National Population Health Science Data Center, which approved our experiments. All experiments involving humans and the use of human tissue samples were conducted in accordance with relevant guidelines and regulations. All data used were obtained with informed consent from the subjects or their legal guardians. The research subjects of this study were 200 patients with diabetic retinopathy, among whom the first 150 patients were used as the training set and the remaining 50 patients as the test set. There are 125 male patients and 75 female patients. The dataset contains 70 testing indicators. After eliminating indicators with severe data loss and strong noise, the final dataset contains 68 disease research indicators, including 41 basic physical condition indicators of patients and 27 laboratory testing indicators (19 for blood biochemistry, 4 for routine blood, and 4 for blood coagulation).
Because a small number of samples have missing values, we use the K-Nearest Neighbor (KNN) algorithm to impute the missing data and then apply min-max normalization to the completed sample data. The data set contains 200 samples, and each sample consists of 28 input parameters and 1 output parameter. In the following, all calculations are based on the standardized sample data.
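The preprocessing described above can be sketched as follows. This is a minimal illustration, not the authors' code: the function names `knn_impute` and `min_max_scale`, the choice of k, the root-mean-square distance over shared observed columns, and the toy values are all assumptions made for the example. It uses only the Python standard library; in practice a library implementation such as scikit-learn's `KNNImputer` and `MinMaxScaler` would be a typical choice.

```python
import math

def knn_impute(rows, k=3):
    """Fill None entries with the mean of that column over the k nearest rows.

    Distance between two rows is the root-mean-square difference over the
    columns that both rows have observed (an assumption of this sketch).
    """
    filled = [list(r) for r in rows]
    for i, row in enumerate(rows):
        missing = [c for c, v in enumerate(row) if v is None]
        if not missing:
            continue
        dists = []
        for j, other in enumerate(rows):
            if j == i:
                continue
            shared = [c for c in range(len(row))
                      if row[c] is not None and other[c] is not None]
            if not shared:
                continue
            d = math.sqrt(sum((row[c] - other[c]) ** 2 for c in shared) / len(shared))
            dists.append((d, j))
        dists.sort()
        for c in missing:
            # Nearest rows that actually have a value in column c.
            donors = [rows[j][c] for _, j in dists if rows[j][c] is not None][:k]
            if donors:
                filled[i][c] = sum(donors) / len(donors)
    return filled

def min_max_scale(rows):
    """Rescale every column to the interval [0, 1]."""
    cols = list(zip(*rows))
    lo = [min(c) for c in cols]
    hi = [max(c) for c in cols]
    return [[(v - l) / (h - l) if h > l else 0.0
             for v, l, h in zip(r, lo, hi)] for r in rows]

# Hypothetical toy data: 4 samples x 3 indicators, one missing value.
rows = [[5.0, 1.0, 10.0],
        [6.0, None, 12.0],
        [5.5, 1.2, 11.0],
        [9.0, 3.0, 20.0]]
filled = knn_impute(rows, k=2)
scaled = min_max_scale(filled)
```

Imputing before scaling keeps the column minima and maxima from being distorted by placeholder values.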

Improved BP neural network theory
The BP neural network can be used to study the relationship between the detection indexes and the risk of patients with diabetes complications. When the sample size or the number of research indexes is large, the best model fit can be achieved by adjusting the internal parameters of the algorithm, which makes the warning more reliable. The improved architecture of the BP neural network is shown in Fig. 1.
The data of 28 indicators of diabetic patients were taken as input samples, and fasting glucose (GLU) was taken as the output value. Specifically, the network includes 3 layers, namely, the input layer, the hidden layer and the output layer. The input parameters were denoted $X_p\ (p = 1, 2, \dots, j)$ and the output parameters $X'_q\ (q = 1, 2, \dots, k)$. A linear transformation of $X_p$ was carried out to obtain the input $HI_i$ and output $HO_i$ of each node of the hidden layer, where $W_{ij}$ is the connection weight between the $i$th node of the hidden layer and the $j$th node of the input layer, and $\varphi$ is the excitation function of the neural network. We choose the Sigmoid function, i.e., $\varphi(x) = \frac{1}{1+e^{-x}}$. Here $x_j$ represents the training data of the $j$th node in the input layer, and $\theta_i$ is the node threshold of the hidden layer.
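For reference, the hidden-layer transformation described in this paragraph can be written out as below. This is a reconstruction under the standard BP formulation using the symbols defined in the text; in particular, the assumption that the threshold $\theta_i$ enters additively (rather than being subtracted) is ours and is not confirmed by the text.

```latex
HI_i = \sum_{j} W_{ij}\, x_j + \theta_i ,
\qquad
HO_i = \varphi(HI_i) = \frac{1}{1 + e^{-HI_i}}
```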
When building the network, we tried different node transfer functions, training functions, network learning functions and performance analysis functions. By comparing the network prediction errors under different settings, we found that the prediction accuracy is highest with the following settings: the S-type functions 'tansig' and 'logsig' are taken as node transfer functions, the momentum back-propagation gradient descent algorithm 'TrainLM' is taken as the training function, the function with a momentum term 'trainlm' is taken as the learning function, and the mean square error (MSE) is taken as the network performance analysis function.
After the training and feedback adjustment of the nodes in each layer, the result $O_k$ of the output nodes is obtained, where $W'_{ki}$ is the connection weight between the $k$th node of the output layer and the $i$th node of the hidden layer, and $\alpha_k$ is the node threshold of the output layer. $D_k(x_p)$ is the actual value of the predicted sample node at the output layer, and the prediction error $GE_p$ is used to test the prediction accuracy of the model.
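The forward pass through the three layers can be sketched in a few lines. This is an illustration under stated assumptions, not the authors' implementation: the thresholds are added to the weighted sums, the Sigmoid is used at both layers (the text also mentions 'tansig'), the error is computed as a mean square error, and the names `forward` and `mse` and all numeric values are hypothetical.

```python
import math

def sigmoid(x):
    """Excitation function phi(x) = 1 / (1 + exp(-x))."""
    return 1.0 / (1.0 + math.exp(-x))

def forward(x, W, theta, W_out, alpha):
    """One forward pass: input -> hidden -> output.

    x      : input values x_j
    W      : W[i][j], weight between hidden node i and input node j
    theta  : hidden-layer thresholds theta_i (assumed additive)
    W_out  : W_out[k][i], weight between output node k and hidden node i
    alpha  : output-layer thresholds alpha_k (assumed additive)
    """
    hidden = [sigmoid(sum(w * xj for w, xj in zip(W[i], x)) + theta[i])
              for i in range(len(W))]
    out = [sigmoid(sum(w * h for w, h in zip(W_out[k], hidden)) + alpha[k])
           for k in range(len(W_out))]
    return hidden, out

def mse(predicted, actual):
    """Mean square error between predicted and actual output values."""
    return sum((p - a) ** 2 for p, a in zip(predicted, actual)) / len(predicted)

# Hypothetical tiny network: 2 inputs, 2 hidden nodes, 1 output.
x = [0.0, 0.0]
W = [[0.3, -0.2], [0.1, 0.4]]
theta = [0.0, 0.0]
W_out = [[1.0, -1.0]]
alpha = [0.0]
hidden, out = forward(x, W, theta, W_out, alpha)
error = mse(out, [1.0])
```

Training then adjusts `W`, `theta`, `W_out` and `alpha` to reduce this error; the weight matrix `W` is the one later examined to rank the indicators.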

Disease early warning based on key markers
After training the BP neural network on the data, when the network prediction rate reaches the optimal level, we record the weight of each index as signals pass from the input layer to the hidden layer. Next, we conduct hierarchical cluster analysis on the weights of these indexes and extract the higher-weight indexes. Then, we study the internal correlation of the higher-weight indexes and their correlation with the low-weight indexes, using the Spearman correlation coefficient. These higher-weight indexes are the key markers of disease warning that we studied. The traditional way to test for diabetes is to measure fasting venous blood glucose and 2-h venous blood glucose (OGTT) by a single method, whereas key markers can be used as a new method to systematically warn of the occurrence of such complex diseases.
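As an illustration of the correlation step, the Spearman coefficient can be computed as the Pearson correlation of the ranks; when there are no ties this equals the familiar form $\rho = 1 - \frac{6\sum d_i^2}{n(n^2-1)}$, where $d_i$ is the rank difference of the $i$th pair. The sketch below is a standard-library version with average ranks for ties; the function names and sample values are hypothetical and not taken from the study.

```python
def average_ranks(values):
    """Rank values from 1..n, giving tied values the average of their ranks."""
    order = sorted(range(len(values)), key=lambda i: values[i])
    ranks = [0.0] * len(values)
    pos = 0
    while pos < len(order):
        end = pos
        while end + 1 < len(order) and values[order[end + 1]] == values[order[pos]]:
            end += 1
        avg = (pos + end) / 2.0 + 1.0
        for t in range(pos, end + 1):
            ranks[order[t]] = avg
        pos = end + 1
    return ranks

def spearman(x, y):
    """Spearman correlation: Pearson correlation of the rank vectors.

    Raises ZeroDivisionError if either input is constant.
    """
    rx, ry = average_ranks(x), average_ranks(y)
    n = len(x)
    mx, my = sum(rx) / n, sum(ry) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(rx, ry))
    sx = sum((a - mx) ** 2 for a in rx) ** 0.5
    sy = sum((b - my) ** 2 for b in ry) ** 0.5
    return cov / (sx * sy)

# Hypothetical values of two indicators across five patients.
marker_a = [1.0, 2.0, 3.0, 4.0, 5.0]
marker_b = [2.0, 1.0, 4.0, 3.0, 5.0]
rho = spearman(marker_a, marker_b)
```

Applying `spearman` to every pair of indicators gives the correlation matrices compared between key markers and non-marker indicators.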
In order to obtain strong warning signals for disease surveillance, we constructed the warning index EWI, where $\mathrm{marker}_{SD}$ refers to the score matrix of the key markers after standardization, with entries $y_{ij}$; $y_j$ is the column average of the standardized scores of the key markers; and $\mathrm{Skewness}(\mathrm{marker}_{SD})$ represents the sample skewness of the standardized scores of the key markers, denoted $s_j$.
(1) Standardized processing. The original data of the key markers are standardized by row to eliminate the dimensional influence between different index data.
(2) Arithmetic averaging. The standardized score matrix is averaged by column to obtain $y_j$, i.e., the comprehensive index of different individuals.

$$\mathrm{EWI} = \frac{\mathrm{Mean}(\mathrm{marker}_{SD})}{\mathrm{Skewness}(\mathrm{marker}_{SD})}$$
As an index built from the key markers, EWI shows drastic fluctuations when individual indexes are abnormal; such a fluctuation is called a strong warning signal. When a strong warning signal occurs, the correlation between key markers is generally high, and much higher than the correlation between the internal indexes of the key markers and other indexes. This property can be used as an important feature in identifying key markers. At the same time, when key markers are used for disease warning, the EWI index shows stronger fluctuations than traditional detection indexes, and it gives a strong warning signal when there are abnormalities in individual sign data.
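The construction of the warning index can be sketched as follows. Several choices here are assumptions of this sketch rather than details confirmed by the text: rows are taken to be key markers and columns individuals, the row standardization is a z-score, the skewness is the population (biased) form, and EWI is read as the ratio of the column mean to the column skewness. The function names and the toy matrix are hypothetical.

```python
from statistics import mean, pstdev

def zscore_rows(matrix):
    """Standardize each row (one key marker across all individuals)."""
    out = []
    for row in matrix:
        m, s = mean(row), pstdev(row)
        out.append([(v - m) / s for v in row])
    return out

def skewness(values):
    """Population skewness: third central moment over variance ** 1.5."""
    m = mean(values)
    m2 = mean([(v - m) ** 2 for v in values])
    m3 = mean([(v - m) ** 3 for v in values])
    return m3 / m2 ** 1.5

def ewi(matrix):
    """Per-individual index: column mean y_j divided by column skewness s_j."""
    z = zscore_rows(matrix)
    return [mean(col) / skewness(col) for col in zip(*z)]

# Hypothetical toy data: 4 key markers (rows) x 3 individuals (columns).
markers = [[1.0, 2.0, 6.0],
           [2.0, 3.0, 10.0],
           [5.0, 4.0, 9.0],
           [1.0, 1.0, 4.0]]
index = ewi(markers)
```

Because the skewness appears in the denominator under this reading, an individual whose standardized marker scores are nearly symmetric produces a very large index value, so a practical implementation would need to guard against a skewness close to zero.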

Model simulation based on BP neural network
The BP neural network model given in the "Materials and methods" section was trained on the data mentioned above (https://www.ncmi.cn/).
Using the BP neural network, we studied the risk early warning of diabetic retinopathy (CSTR: A0006.11.A0005.202006.001018) from 28 indexes, such as glycosylated hemoglobin, hemoglobin, triglyceride, total cholesterol, and fasting glucose. These indexes come from the medical data of patients with diabetic retinopathy. The data are divided into a training set and a test set at a ratio of 75% to 25%, and the output of the network is the GLU index value.
The model predicts the trend of GLU in patients from the 28 clinically measured medical indexes, and the network is trained with 75% of the data. When the model meets the preset accuracy, the remaining 25% of the data is used to test the performance of the model, as shown in Fig. 2.
Figure 2 shows the predicted and actual values of fasting glucose in 200 medical samples. The blue line and the red line in the figure almost coincide, indicating that the model predicts the GLU level of individuals well. Figure 3 shows the prediction error of the model, with the relative error rate fluctuating around 0. The two figures show that the method fits the data very well and that the prediction error is relatively low.
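The evaluation protocol above can be illustrated with a short sketch. The helper names are hypothetical, the split simply takes the first 75% of samples in order (matching the "first 150 patients" description in the Data section), and the relative error is assumed to be (predicted − actual) / actual; the glucose values are made up for the example.

```python
def split_first_fraction(samples, train_fraction=0.75):
    """Use the first train_fraction of the samples for training, the rest for testing."""
    cut = int(len(samples) * train_fraction)
    return samples[:cut], samples[cut:]

def relative_errors(predicted, actual):
    """Signed relative error per sample (assumed definition)."""
    return [(p - a) / a for p, a in zip(predicted, actual)]

# 200 hypothetical sample indices split 150 / 50.
train, test = split_first_fraction(list(range(200)))

# Hypothetical predicted vs. measured fasting glucose values (mmol/L).
errors = relative_errors([5.5, 6.3], [5.0, 7.0])
```

A relative error curve that stays close to zero across the test samples corresponds to the behaviour described for Fig. 3.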
To further detect the key markers affecting the occurrence of diabetic complications (retinopathy), we carried out an in-depth study of the weight matrix of the neural network transmission layer. First, a three-dimensional graph was used to show the weight relationship between the 28 biochemical indicators; the results are shown in Fig. 4.
In Fig. 4, each rectangular column represents the importance of an indicator in the prediction of the individual GLU curve. The figure shows that the biochemical indexes with higher weights include glycosylated hemoglobin (HBA1C), total cholesterol (TC), and total protein (ALB). Glycosylated hemoglobin is the product of hemoglobin in red blood cells combining with sugars in serum, and is commonly used as a test indicator for diabetes control. The TC value is an independent risk factor for diabetic retinopathy (DR). Investigation results showed that the blood lipid level could be adjusted in a timely manner according to the TC value of DR patients in clinical practice to prevent the occurrence of DR. Serum total protein can be divided into albumin and globulin, and has the physiological function of transporting a variety of metabolites and regulating the transported substances. However, the occurrence of diabetes is not caused by a single factor, but by the joint action of multiple indexes that constitute the key markers. Therefore, we used the weight matrix to further study the key markers for disease early warning.

Detection of key markers for early warning of diabetic complications
Hierarchical clustering, a method for discerning underlying structure within complex datasets, was applied to the weight data of the 28 indicators. First, the weight data were sorted in descending order. Then, the Euclidean distance was used to quantify the similarity between each pair of indicators, which provides the basis for the clustering process. The shortest distance method was adopted to construct the cluster tree, as it progressively merges clusters based on the minimal distance between their constituent elements, producing a hierarchical arrangement that reflects the proximity relationships within the data. Cluster distances falling within the top 5% were used to delineate a new module characterized by pronounced distinctiveness and variability, as shown in Fig. 5.
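The clustering procedure can be sketched as below: a plain agglomerative loop with the shortest-distance (single-linkage) rule on one-dimensional weights, where the Euclidean distance reduces to an absolute difference. This is a simplified illustration, not the authors' code. How the top-5% distance rule translates into a cut of the tree is not specified in the text, so the cut is left as the parameter `max_distance`; the indicator weights are invented for the example.

```python
def single_linkage(weights, max_distance=float("inf")):
    """Agglomerative clustering of scalar weights with the shortest-distance rule.

    weights      : dict mapping indicator name -> weight
    max_distance : stop merging once the closest pair of clusters is farther
                   apart than this (equivalent to cutting the tree there)
    Returns (clusters, merge_distances).
    """
    clusters = [[name] for name in weights]
    merge_distances = []
    while len(clusters) > 1:
        best = None
        for a in range(len(clusters)):
            for b in range(a + 1, len(clusters)):
                # Single linkage: distance between the two closest members.
                d = min(abs(weights[p] - weights[q])
                        for p in clusters[a] for q in clusters[b])
                if best is None or d < best[0]:
                    best = (d, a, b)
        d, a, b = best
        if d > max_distance:
            break
        merge_distances.append(d)
        clusters[a].extend(clusters[b])
        del clusters[b]
    return clusters, merge_distances

# Hypothetical weights for six indicators.
weights = {"HbA1c": 0.92, "UA": 0.88, "ALT": 0.85,
           "TC": 0.40, "TG": 0.35, "PLT": 0.05}
clusters, merge_distances = single_linkage(weights, max_distance=0.1)
```

With these toy values the three highest-weight indicators end up in one cluster, which mirrors how a high-weight module separates from the remaining indicators. For larger problems, `scipy.cluster.hierarchy.linkage` with `method="single"` provides the same linkage rule.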
Figure 5 shows the hierarchical clustering results; the ordinate is the clustering distance between the weights of biochemical indexes, and the abscissa is the classification of biochemical indexes. The new module after clustering consists of 7 factors, namely, glycosylated hemoglobin, serum uric acid, hemoglobin, alanine aminotransferase, γ-glutamyl transferase, alkaline phosphatase, and activated partial thromboplastin time (APTT). Changes in HbA1c level can affect the oxygen saturation of arterial and venous blood in diabetic retinopathy, and long-term control of blood glucose may delay the progression of DR lesions. Serum uric acid is the final product of the catabolism of purine compounds, which reflects, to some extent, the efficiency of metabolism and decomposition of nucleic acids and other purine compounds. Hemoglobin is a protein responsible for carrying oxygen in higher organisms. It can form hemoglobin A1c on contact with blood sugar, thus serving as an effective indicator for the detection of diabetes. Its importance has also been verified in our key marker detection. Alanine aminotransferase has also been found to have a good effect in improving lipid metabolism and to act synergistically with the AST level. γ-Glutamyl transferase is an enzyme involved in glutathione metabolism. Its elevated level may be the result of oxidative stress in the body, and it can be used to indicate the damage to cells caused by oxygen free radical activity. Serum alkaline phosphatase mainly reflects whether the metabolism of the three major substances in the body is normal. Studies have shown that bone-specific alkaline phosphatase (BAP) is an important indicator of bone metabolism, and tartrate-resistant acid phosphatase 5b (TRACP-5b) is a marker enzyme reflecting osteoclast activity and bone resorption. Therefore, biochemical indicators of bone turnover can be used as a reference for predicting diabetic retinopathy and its severity. APTT is one of the indicators that reflect an individual's clotting function. Hyperglycemia can cause glycosylation modifications of thrombin, leading to activation of the clotting mechanism. Shortening of APTT is considered a marker of hypercoagulability. Research has shown that the relationship between diabetic retinopathy and the state of coagulation activation can be understood by detecting changes in the coagulation function of diabetic patients. All the above indicators related to diabetic retinopathy were detected among the key markers. The initiation of diabetic retinopathy is mainly induced through individual lipid metabolism, cellular redox, bone metabolism, bone resorption and coagulation function.
The identification of key markers for diabetic retinopathy, including factors such as glycosylated hemoglobin, serum uric acid, hemoglobin, alanine aminotransferase, gamma-glutamyl transferase, alkaline phosphatase, and activated partial thromboplastin time, is crucial for understanding the complex pathophysiological mechanisms underlying the condition. These markers have been selected based on their known associations with essential physiological processes like lipid metabolism, oxidative stress, bone metabolism, and coagulation function, all of which play significant roles in diabetic retinopathy development. By analyzing the interactions among these key markers, a comprehensive predictive model can be constructed to assess an individual's risk of developing diabetic retinopathy and predict disease progression. For example, glycosylated hemoglobin, a key indicator of long-term glucose control, influences blood oxygen saturation and may contribute to retinal hypoxia, a factor in diabetic retinopathy pathogenesis. Understanding the synergistic effects of these markers is essential for developing effective strategies for early detection and management of diabetic retinopathy.

Characteristic analysis of key markers
To validate the effectiveness of key markers in predicting disease progression, a correlation analysis was performed on the identified key indicators. The Spearman correlation coefficient was calculated to quantify the strength and direction of the monotonic relationship among the internal indicators of the key markers, and between these indicators and non-marker indicators. This analysis assesses the degree of association between the key markers and non-marker indicators and provides insight into their interdependencies. A higher correlation among key markers signifies their systemic relevance and predictability, particularly in the context of early warning for diabetic retinopathy. This systemic relevance indicates that changes in these key markers may serve as reliable indicators of disease progression or the emergence of complications. By focusing on highly correlated key markers, more accurate and cost-effective early warnings for diabetic retinopathy can be achieved, ultimately leading to improved patient outcomes and reduced medical costs. The results are shown in Figs. 6 and 7. The correlation values in Fig. 6 are higher on the whole, indicating that the key markers are more systematic and correlated, while the Spearman correlation coefficients in Fig. 7 are lower on the whole, with a small fluctuation range. We can therefore conclude that the key markers are highly correlated and highly predictive. In the early warning of diabetic retinopathy, we can focus on the detection and study of key markers, which saves medical costs and at the same time gives a stronger warning than other indicators. This is a new and cheaper way to warn of diabetic retinopathy (complications).
In summary, the use of key markers in a composite index for early warning of diabetic retinopathy presents a significant advancement in healthcare management. By providing a comprehensive assessment of various physiological factors related to the disease, these key markers offer a more cost-effective approach to detecting potential complications at an earlier stage. Despite the need for additional tests to compile the parameters used in the approach, the information gathered from these markers enables healthcare providers to intervene early and prevent the progression of diabetic retinopathy. This proactive strategy not only leads to cost savings by reducing the need for expensive treatments for advanced stages of the disease but also improves patient outcomes. Comparatively, the composite index of key markers surpasses the reliance on traditional fasting blood glucose levels for early disease detection. By incorporating multiple markers into the index, a more nuanced understanding of an individual's risk of developing diabetic retinopathy is obtained, allowing for timely intervention and improved disease management. Despite the initial costs associated with conducting multiple tests, the long-term benefits in terms of improved patient care and cost savings justify the investment in this approach to disease detection and management.

Discussion
Diabetic retinopathy is one of the most common microvascular diseases in diabetes, essentially caused by abnormal blood glucose metabolism resulting from insufficient insulin secretion or decreased insulin activity. Its pathogenesis is complex, and its development is affected by many factors. Therefore, research should integrate multiple individual pathogenesis pathways. As a new way of disease warning, the key marker can detect the onset signal of the individual before the disease occurs. In this study, a novel algorithm based on the BP neural network was proposed to detect the warning signal of diabetic retinopathy. This method not only reduces the cost of medical testing, but also improves the efficiency of early warning and diagnosis.
In this paper, we applied the algorithm to the detection of key markers affecting pathogenesis and to the early warning of disease onset. First, we introduced a disease warning method based on the BP neural network, which studies how diabetic retinopathy is induced from the aspects of individual lipid metabolism, cellular redox, bone metabolism and bone resorption, and clotting function. Second, according to individual medical test data, we identified the key markers that induce diabetic retinopathy. In addition, the mechanism of action of the key markers was preliminarily determined: the indicators within the key markers did not act independently to induce the occurrence of the disease, but were highly correlated with each other. As shown in Fig. 5, they coordinated and acted together before the occurrence of the disease. At the same time, in disease early warning, the key markers can serve as a complete module that forms a strong early warning signal when a disease occurs. As shown in Fig. 8, the key markers detected by this algorithm have higher sensitivity and effectiveness in disease early warning. Finally, the disease warning method based on the BP neural network provides a new method for the study of diabetic microvascular lesions. In clinical prediction and diagnosis, doctors can use key markers to give early warning of individual diseases and intervene early.
The BP neural network used in this algorithm has a prominent ability to deal with complex relationships, and the network weights play a very important role in the transmission process between the nodes of each layer. How to obtain, analyze and reasonably apply the weight matrix between the elements of each layer is also a distinguishing feature of this paper. In this study, by adjusting the functions and neuron parameters of the BP neural network, the network model with the optimal training effect and the lowest experimental error was obtained. We obtain the weight matrix in this network state and perform hierarchical cluster analysis on the weight data of each medical index in the transmission process. We select the molecular module with high weight and closest clustering distance as the key marker of disease early warning. We study both the independent importance of individual signs in disease early warning and the coordinated correlation between them in inducing disease occurrence. This method has unique advantages in the study of diabetes complications.
It is worth noting that the gradient method utilized by the BP neural network may lead to challenges such as slow learning convergence and susceptibility to local minima. Additionally, the reliance on experiential judgment to determine learning and inertial factors for network convergence may impede the practical application of the BP neural network. To address these limitations, we plan to explore alternative optimization algorithms that can offer improved convergence rates and robustness against local minima. Furthermore, we will investigate strategies to automate the selection of learning and inertial factors, possibly through the use of hyperparameter tuning methods or automated machine learning techniques. By optimizing these parameters automatically, we aim to reduce the reliance on experiential judgment and facilitate the practical implementation of the BP neural network in real-world scenarios.

Conclusions
In studies of disease early warning, traditional methods rely on linear models to study the linear relationship between two indexes. Based on the theory of the BP neural network, in this paper we studied the complex network relationship among the indexes that affect the occurrence of disease, and determined the key markers that induce the occurrence of diabetic retinopathy. We found that there was a strong correlation between key markers, while the correlation between the internal indicators of the key markers and non-marker indicators was low. This implies that the indicators that become key markers in disease early warning are synergistic: they can form a complete module that transmits strong early warning signals before the occurrence of disease, and they provide a reference for clinical prediction and diagnosis. In studying the early warning of diabetic retinopathy, we found that the clinical indicators that form the key markers, such as glycosylated hemoglobin, serum uric acid, and alanine aminotransferase, have been shown in much of the literature to be closely related to the occurrence of the disease. They induce the occurrence of disease from the aspects of individual lipid metabolism, cellular redox, bone metabolism and bone resorption, as well as coagulation function. Therefore, this method can be used to detect key markers affecting diabetic complications, explore the induction mechanism of disease occurrence, and provide early warning of the disease. Clinically, it is helpful for the intervention, diagnosis and treatment of diabetic complications. It can therefore improve the quality of life of patients and reduce the mortality rate of diabetic complications.

Scientific Reports | (2024) 14:15108 | https://doi.org/10.1038/s41598-024-65694-x

(3) De-skew processing. The skewness of the key marker index data of a single individual is calculated, which is a key step in enhancing the sensitivity of the warning index to data fluctuations.

Figure 2. Comparison of predicted value and actual value based on BP neural network.

Figure 3. Relative prediction error rate of fasting glucose based on neural network.

Figure 4. Significance recognition of indicators of diabetic retinopathy.

Figure 5. Detection of key markers that induce diabetic retinopathy.

Figure 8 shows the warning curves based on GLU data and on the composite index of key markers, respectively, in which the red curve is the GLU fluctuation of 200 patients. If the traditional fasting blood glucose level is used to detect the occurrence of diabetes complications, the curve fluctuates only slightly and cannot give a good warning of the occurrence of the disease; abnormalities can only be detected after the onset of the disease. However, the composite index based on key markers shows strong fluctuations during the individual's medical testing phase, as shown by the blue curve. In this figure, there are six strong fluctuation signals, which indicate that the corresponding individual samples show abnormal physical signs, namely, upcoming diabetic retinopathy based on this data set. Key markers are more sensitive than other indicators. Moreover, the index does not warn from a single perspective; it transmits warning signals to the occurrence

Figure 6. Internal correlation of key markers.

Figure 7. Correlation between key markers and non-key marker indicators.

Figure 8. Warning of diabetic retinopathy based on key markers.