IMPLEMENTATION OF CONVOLUTIONAL NEURAL NETWORK FOR PREDICTING GLAUCOMA FROM FUNDUS IMAGES

unknown what causes these diseases to develop, but if not treated, they result in optic nerve atrophy. For this reason, in this paper we propose a novel deep learning system for the automatic diagnosis of glaucoma that uses a convolutional neural network for classification, demonstrates improved performance, and records computation time for fundus images. The results showed an accuracy of 94 % and a loss value of only 0.27. The model we built with Keras achieved good results in both training and testing. These results demonstrate the ability of a deep learning model to identify glaucoma from fundus images. Increasing the number of filters and training the model resulted in a higher accuracy rate. A population survey conducted in 2019 shows that most patients with glaucoma become aware of their disease late, after it has caused a high level of optic nerve damage and a high percentage of vision loss. Early diagnosis and detection of glaucoma using optic nerve imaging technology have gained wide clinical interest as a way of stopping or slowing the progression of the disease, allowing the development of new algorithms to automate the diagnosis of eye


Introduction
The balance of inflow and outflow of fluid in a healthy eye is constantly maintained by a certain pressure. With glaucoma, this circulation is disturbed, fluid accumulates, and intraocular pressure begins to rise. As a result, the optic nerve atrophies, and visual signals cease to flow to the brain. The consequence of increased pressure is the development of visual impairment and atrophy characteristic of glaucoma with excavation of the optic nerve.
In the Republic of Kazakhstan, men and women aged 40 to 70 undergo uniform preventive medical (screening) examinations every 2 years for early detection of glaucoma, using diagnostic tests that measure intraocular pressure [1]. Glaucoma is one of the most severe eye diseases, characterized by rapid progression and irreversible blindness, and up to 30-50 % of retinal ganglion cells can be lost in 2 years before it is detected by standard visual field testing [2]. While studying ophthalmology departments in the Republic of Kazakhstan, we identified several problems that doctors face. The first problem is that the doctor's examination relies on only one parameter: currently, the only screening test for glaucoma in the Republic of Kazakhstan is the measurement of intraocular pressure. However, an analysis of previous works [3,4] shows that measuring intraocular pressure is not enough to combat glaucoma: it is necessary to broaden the knowledge of tests that can be quickly and easily applied to the target population in order to presumptively identify an unrecognized disease in a healthy, asymptomatic population as a whole. For this reason, another parameter was needed that could improve the efficiency of glaucoma detection in the future. The fundus was chosen as this parameter because, according to the results of foreign studies [5][6][7], it is the most suitable for achieving high performance in data processing. The second problem is the lack of modern equipment. The screening model currently in use is outdated, and because there are no modern models that could help doctors speed up the process, they have to check patients' results manually in order to make a diagnosis. Therefore, to improve the efficiency of doctors, it is necessary to introduce a learning network that expands the possibilities for monitoring and treating the disease.
For the automatic diagnosis of glaucoma, it is necessary to choose the right architecture for a trained neural network using different machine learning algorithms [8]. By building an efficient, accurate, and easy-to-use automated diagnostic system and testing an interpretable classifier capable of distinguishing between normal eyes and glaucoma using machine learning techniques, we can help ophthalmologists move from manual to automated testing, and thus improve the accuracy of detecting changes in time and prevent blindness in patients. Therefore, studies of the types and parameters of the future model directly affect how well the neural network detects glaucoma, which is relevant for medicine. This disease is the second most common cause of blindness worldwide and can occur at any age. We expect that by applying machine learning techniques to detect and monitor glaucoma, we can help improve the efficiency of doctors and reduce the progression of the disease in the population.
The methodology [11], which uses the vertical cup-disc ratio (VCDR) as one of the risk factors, was proposed in order to identify glaucoma. Using images of the fundus, the radius from the image size to the optic nerve head (ONH) was determined with an interval of 10 to 60 %. The authors showed that deep learning could detect glaucoma from areas of the fundus image outside the ONH, and 95 % confidence intervals (CI) were obtained for all indicators using this kind of application. A deep learning algorithm [12] was trained on OCT (Machine to Machine, M2M) data to predict retinal nerve fiber layer (RNFL) thickness from fundus photographs. The aim of the study was to determine whether baseline and longitudinal changes in M2M RNFL thickness could predict the development of visual field defects. The average rate of change of the M2M RNFL thickness was -1.02 μm/year.
After adjusting for potential confounding variables such as age, gender, race, baseline PAP, and mean IOP, baseline and longitudinal changes in the M2M RNFL thickness predictions still predicted future conversion to glaucoma. Moreover, eyes whose M2M RNFL thickness predictions were low over time had a significantly higher likelihood of converting to glaucoma. The study [13] proposed using a self-organized operational neural network (Self-ONN) to identify glaucoma at an early stage from fundus images. During training, Self-ONN showed excellent performance while reducing the computational complexity of training the model with small datasets. The classifier achieved a significant performance gap of 8-12 % in F1 score compared to equivalent CNN models and even deep CNN models on the three glaucoma reference datasets [13]. However, due to various shifts in data distribution, the performance of a well-behaved model can drop dramatically when it is deployed in a new environment [14]. Recent research [14] proposes DAFA, a new method for feature alignment based on data augmentation (DA), which smooths features and removes distribution shifts in fundus image data. The set included 7 datasets, with mean net area under the curve (mcAUC) used to assess model accuracy. The experiment showed that DAFA can consistently and significantly improve performance regardless of the training data and outperform current out-of-distribution (OOD) generalization methods. The method proposed in [15] introduced a framework to improve fundus analysis results using a CNN with robust design of experiments (DOE) and Retinex theory. The system differed in the simplicity of its CNN model compared to deep networks like GoogLeNet and ResNet152. The experiment surpassed all approaches in the automatic detection of glaucoma, with sensitivity, specificity, and accuracy averaging 0.97 and errors of less than 0.03.
Later, five different models [16] pretrained on ImageNet (VGG16, VGG19, InceptionV3, ResNet50 and Xception) were trained on fundus images, achieving 96 % accuracy. Using a cross-validation strategy, the ImageNet-pretrained models were found to be reliable for classifying glaucoma [17]. During testing, it was found that a fine-tuned model can be competitive in performance when tested using different databases.

An improvement in disc segmentation has been presented, based on the observation that the presence of a cup in the disc is a sign of glaucoma. The method [18] determines the correct location of the cup by detecting edges and segmenting the vessels. The accuracy obtained after segmentation reached 95 %, which overcomes the problems associated with noise, macula generation or data collection with tools.

Literature review and problem statement
There are various ways to detect glaucoma, including different types of algorithms, features, and technologies. Image segmentation is performed with pixel-level accuracy, so it plays an important role in tasks such as the analysis of medical tests. Manual verification of glaucoma using OCT images of the fundus is a very time-consuming task whose effectiveness depends on the experience of the ophthalmologist, so residual neural network architectures can be useful for improving the performance of multilevel neural networks. After examining each of the studies, we grouped them by convolutional neural network architecture type: fully convolutional networks, residual networks, and other types.
Localization and segmentation of the optic disc (OD) on fundus images is an important step in identifying the early onset of retinal diseases such as macular degeneration, diabetic retinopathy, and glaucoma [5]. The paper [5] offers a neural network for OD segmentation that modifies the DeepLab v3+ and U-Net architectures. Comparing the results with other architectures, the authors found that attention and conditional random field (CRF) mechanisms can improve the performance of models for OD segmentation. Along with these techniques, the work [6] developed an ensemble-based tool with two subsystems: the first uses segmentation and fundus methods to train the model, and the second passes the data to a trained CNN; the tool then combines the outputs and provides them to the doctor. Compared to other works, the subsystems yielded a high classification rate while providing enough information about the diagnosis. Two problems in glaucoma detection were studied in [7]: training on multiple datasets to make segmentation independent of the source, and efficient segmentation performance. The authors developed a U-Net that was tested for the first time on a Google TPU for training and predicting the segmented optic disc and cup, which are used as parameters for glaucoma detection. As a result, a very small number of alternatives provided good quality cup segmentation while keeping the number of network parameters at an acceptable level [7]. In the future, DL may increase productivity in ophthalmology, as described in [9], which provided an overview of DL systems using OCT images. There are several reasons why DL has an advantage with OCT. The first is the suitability of the modality for DL: OCT scans of the macula are assembled to train deep learning systems, which can help in the convergence of multilayer networks.
Second, OCT contains a three-dimensional structure with information compared to color photographs of the fundus, in which the fixation between the field of view and the macula does not change during data processing.
The most important quality is that OCT allows you to see all the fine details in the fundus, opening the possibility of identifying new features for detecting glaucoma in the fundus. In [10], a 13-layer CNN using the Keras deep learning library with TensorFlow as a back end is proposed for automated glaucoma diagnosis. The CNN performed significantly better in classifying visual images than analog data schemes, resulting in a model with a SoftMax classifier. The study [19] discussed different approaches to glaucoma detection, in which the optic disc and optic cup were segmented using textural features and fusion in machine learning. It was noted that if the patient has several diseases, this can also affect the performance of the algorithm. The work [20] considered the characteristic changes of the optic nerve in the disease, including loss of the neuroretinal rim, focal notches, OD asymmetry, PPA, and changes in the appearance of the blood vessels. This also explains why computerized methods based on retinal fundus imaging are effective; they include NFLD detection, ONH analysis, and CDR determination on conventional and stereo photographs. Advances in artificial intelligence (AI) and the application of deep learning to ophthalmic image analysis have led to remarkable progress on various color fundus photograph datasets. In the future, automated AI diagnostic systems may identify patients with early glaucomatous optic neuropathy (GON), prevent permanent blindness in asymptomatic patients, make expert-level diagnoses in resource-limited settings, and scale to population-wide screening systems [21]. In [21], the performance of the algorithm for GON on fundus images was evaluated. The results of the experiment showed that the deep learning model has great potential for detecting GON, with a specificity of 0.98.
Early diagnosis of GON using automated tools can play a critical role in supporting the ophthalmologist, as patients with glaucoma are not aware of their condition until central visual acuity is impaired [21]. Moreover, the work [22] presented a transfer learning method for training a CNN model that used two different approaches to extract glaucoma features without preprocessing and enhancement steps. In addition, the authors suggested that the model could detect not only glaucoma but also other diseases by analyzing the retina and determining the severity according to the classification of the retina. The work [23] proposed a system of stages that determine the location and height of the optic nerve head and cup for calculating VCDR from fundus images. The authors compared the VCDR results with those of ophthalmologists and found a difference of 0.11. This experiment can be used to identify effective markers and signs of retinal anomalies, as it achieved an AUC of 0.93 for manually annotated VCDR.

The aim and objectives of the study
The aim of the study is to introduce a trainable artificial intelligence network into medical diagnostics, helping medical professionals improve the accuracy of disease diagnosis and reduce the progression and degeneration of retinal ganglion cells in patients.
To achieve this aim, the following objectives are accomplished:
- to develop an interpretable classifier capable of distinguishing healthy eyes from glaucoma based on fundus characteristics;
- to determine the accuracy and efficiency of the neural network model.

1. Object and hypothesis of the study
This paper presents a deep learning architecture for automatic glaucoma detection in which the model combines machine learning and image classification using the U-Net architecture. The CNN model was trained using 2,392 positive glaucoma images and 3,432 non-glaucoma images in the RGB color model. Thus, the developed system provides an efficient classification model for labeling fundus images as glaucoma or non-glaucoma. Part of the study material comes from patients with glaucoma aged 40 to 70 at the Medical Center "Central Family Clinic"; the other part was taken from the LAG [24] database, from which clinical and optical data were collected. Similar clinical and optical data will be obtained from healthy participants of the same age in the control group.

2. Dataset
As mentioned above, in this study we combined two sources, the LAG database and the database of the Central Family Clinic Medical Center, and collected a total of 5,824 fundus images, which are additionally labeled with areas of interest based on an alternative eye tracking method; 2,392 of them are positive for glaucoma and the remaining 3,432 are negative. All samples are labeled with diagnostic results (0 means no glaucoma, 1 means suspected glaucoma). Fig. 1 shows sample images from the database that we used in the course of the work. In order to conduct a study on the topic "Application of a convolutional neural network for predicting pathologies from images", a certificate was obtained for receiving, storing, and using personal data of patients in a manner that does not contradict the legislation of the Republic of Kazakhstan. Some of the images used for training and testing the model were taken from this data collection provided to us by the Central Family Clinic Medical Center. See Appendix A for more information on the availability of the certificate.

3. Data preparation
There were enough images in the database to train the model, so we did not use augmentation. The images were divided into two folders depending on the detected features. The dataset contained 1,182 images of non-glaucoma eyes and 862 images of eyes with glaucoma, which gives a base accuracy of 0.578 (the share of the larger class, 1,182 of 2,044 images). Data normalization was performed to bring the frequency distribution of the data closer to a normal distribution, or at least make it more symmetrical, and to eliminate errors in the data. The dataset was then split into 70 % for training and 30 % for testing.
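The base accuracy and the 70/30 split quoted above can be checked with a short sketch. It is only an illustration under stated assumptions: the function names, the fixed random seed, and the list of labels standing in for the image files are hypothetical, and the paper does not say whether the split was stratified or how it was shuffled.

```python
import random


def majority_baseline(n_negative, n_positive):
    """Accuracy obtained by always predicting the larger class."""
    return max(n_negative, n_positive) / (n_negative + n_positive)


def split_70_30(items, train_fraction=0.7, seed=0):
    """Shuffle a copy of `items` and cut it into train and test parts."""
    shuffled = list(items)
    random.Random(seed).shuffle(shuffled)  # seed is an assumption, for repeatability
    cut = int(round(train_fraction * len(shuffled)))
    return shuffled[:cut], shuffled[cut:]


# Class counts quoted in this section: 1,182 non-glaucoma and 862 glaucoma images.
baseline = majority_baseline(1182, 862)

# Hypothetical labels standing in for the image files (0 = no glaucoma, 1 = glaucoma).
labels = [0] * 1182 + [1] * 862
train, test = split_70_30(labels)

print(round(baseline, 3), len(train), len(test))
```

A model is only informative if its test accuracy clearly exceeds this majority-class baseline.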

Convolutional Neural Network architecture
In this investigation, Keras Conv2D layers were used to build the Convolutional Neural Network (CNN), which is based on the U-Net architecture. The input layer contained the raw image pixel values, with a width of 500, a height of 500, and three color channels (R, G, B). The first convolutional layer learned a total of 16 filters with a kernel size of 3x3, after which max pooling was used to reduce the spatial dimensions. We used ReLU as the elementwise activation function for each layer. The rectified linear activation function outputs its input directly if the input is positive; otherwise, it outputs zero. We chose it because models that use it are easier to train and often perform better. At the input layer, the network receives raw pixel data, which is always noisy, especially in the case of image data. The greater the number of filters, the more abstractions the network can extract from the image data. Because of this, the CNN first extracts some information from the noisy input, and once the useful features are extracted, it starts developing more complex abstractions. For this reason, we further increased the number of filters in the convolutional layers to 32 and then 64, with max pooling applied after each layer. Different from the original U-Net architecture [25], we use zero padding to keep the output dimension for all the convolutional layers of both down-sampling and up-sampling paths. The next step was flattening, which converts the data into a one-dimensional array for input into the next layer. This transformation creates a single feature vector that is passed to the last fully connected layer. After the fully connected layer, the values are passed to the output layer, where an activation function is applied to produce the result. To minimize the cost function with respect to its parameters, stochastic gradient-based optimization is needed to train deep neural networks.
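To make the layer sequence described above easier to follow, the sketch below traces tensor shapes and convolution parameter counts through three stages of a zero-padded 3x3 convolution followed by max pooling. It is not the authors' implementation: the 2x2 pooling window, the use of exactly one convolution per stage, and the helper names are assumptions made for illustration, and the up-sampling path and dense layers are left out because their sizes are not given in the text.

```python
def conv_same(shape, filters, kernel=3):
    """Zero-padded convolution: spatial size is kept, channel count becomes `filters`."""
    height, width, channels = shape
    params = kernel * kernel * channels * filters + filters  # weights plus biases
    return (height, width, filters), params


def max_pool(shape, pool=2):
    """Non-overlapping max pooling; odd sizes are rounded down."""
    height, width, channels = shape
    return (height // pool, width // pool, channels)


def relu(x):
    """Rectified linear unit: pass positive inputs through, output zero otherwise."""
    return x if x > 0 else 0.0


shape = (500, 500, 3)            # input: 500 x 500 RGB image
total_params = 0
for filters in (16, 32, 64):     # filter counts named in the text
    shape, params = conv_same(shape, filters)
    total_params += params
    shape = max_pool(shape)      # assumed 2x2 window

flattened = shape[0] * shape[1] * shape[2]
print(shape, flattened, total_params)
```

Under these assumptions the flattened feature vector is much larger than the convolutional parameter count, so the first fully connected layer would dominate the model size.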
Adaptive Moment Estimation (Adam) was applied to estimate the parameters. Adam maintains moving averages of the first and second moments of the gradients and uses them to update the parameters. The training parameters were set as learning rate=0.0001 and max epochs=120, which were divided into 4 parts of 30 for each training model.
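The Adam update can be illustrated on a single scalar parameter. In the sketch below only the learning rate of 0.0001 comes from the text; the decay rates `beta1` and `beta2`, the constant `eps`, and the toy objective are assumptions (the decay rates are the commonly used defaults), so this shows the update rule rather than the configuration used in the study.

```python
import math


def adam_step(theta, grad, m, v, t, lr=1e-4, beta1=0.9, beta2=0.999, eps=1e-8):
    """One Adam update for a scalar parameter; `t` is the 1-based step index."""
    m = beta1 * m + (1 - beta1) * grad          # moving average of the gradient
    v = beta2 * v + (1 - beta2) * grad * grad   # moving average of the squared gradient
    m_hat = m / (1 - beta1 ** t)                # bias correction for the zero start
    v_hat = v / (1 - beta2 ** t)
    theta = theta - lr * m_hat / (math.sqrt(v_hat) + eps)
    return theta, m, v


# Toy objective f(theta) = theta^2 with gradient 2 * theta (hypothetical example).
first_step, _, _ = adam_step(theta=1.0, grad=2.0, m=0.0, v=0.0, t=1)

theta, m, v = 1.0, 0.0, 0.0
for t in range(1, 4):
    theta, m, v = adam_step(theta, 2 * theta, m, v, t)

print(first_step, theta)
```

On the first step the bias-corrected ratio is close to 1, so the parameter moves by roughly the learning rate regardless of the gradient's magnitude.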

1. The process of creating an interpretable classifier
In our study, we used 1182 images from the Medical Center "Central Family Clinic" and LAG databases, which were divided into 2 classes, with and without glaucoma. We separated these images into 70 % for training and 30 % for testing. The model was a Keras Conv2D convolutional neural network that took 500×500 pixel normalized data as input. It used ReLU as the activation function during training, doubling the number of filters each time and applying max pooling. After training the model, we obtained a training accuracy of 93.75 % with a test size up to 128 so that test and train sizes are divisible by 32. Evaluation scores are shown in Fig. 2. Precision is the share of samples assigned to the positive class that are actually positive. Recall is also reported; it shows the share of actual positive samples that the model classified as positive. The F1 score is the harmonic mean of precision and recall [26]:
$$F_1 = 2 \cdot \frac{\text{precision} \cdot \text{recall}}{\text{precision} + \text{recall}}. \quad (1)$$
The last parameter in the figure is support, which gives the number of samples in each class. Based on the classification results, the precision ranged from 90 % to 95 % and the F1 score from 91 % to 94 %, with an accuracy of up to 94 % at the end of classification.
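The relationship between these scores can be sketched from confusion counts. The F1 score is the harmonic mean of precision and recall, which is equivalent to 2TP / (2TP + FP + FN). The counts below are hypothetical and are not taken from Fig. 2.

```python
def precision_recall_f1(tp, fp, fn):
    """Precision, recall, and F1 from true positive, false positive, false negative counts."""
    precision = tp / (tp + fp) if (tp + fp) else 0.0
    recall = tp / (tp + fn) if (tp + fn) else 0.0
    if precision + recall == 0:
        return precision, recall, 0.0
    f1 = 2 * precision * recall / (precision + recall)  # harmonic mean
    return precision, recall, f1


# Hypothetical counts for the glaucoma class.
precision, recall, f1 = precision_recall_f1(tp=90, fp=10, fn=5)
print(round(precision, 3), round(recall, 3), round(f1, 3))
```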
Early stopping requires the validation dataset to be evaluated during training. Keras supports early stopping of training with the EarlyStopping callback. The callback monitors a chosen performance measure and stops the learning process once that measure stops improving. Fig. 3 shows the training process, which continued until 25 and only then settled at a result of 0.9375.
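The logic behind such a callback can be sketched in plain Python. This is an illustration of patience-based early stopping in general, not the Keras source code and not the configuration used in the study: the patience of 3 epochs, the `min_delta` threshold, and the validation losses are all hypothetical.

```python
def early_stopping_epoch(val_losses, patience=3, min_delta=0.0):
    """Return the 1-based epoch at which training would stop.

    Training stops once the monitored loss has failed to improve by more than
    `min_delta` for `patience` consecutive epochs; otherwise it runs to the end.
    """
    best = float("inf")
    waited = 0
    for epoch, loss in enumerate(val_losses, start=1):
        if loss < best - min_delta:
            best = loss      # new best value, reset the counter
            waited = 0
        else:
            waited += 1
            if waited >= patience:
                return epoch
    return len(val_losses)


# Hypothetical validation losses, one per epoch.
losses = [0.90, 0.60, 0.45, 0.35, 0.30, 0.27, 0.28, 0.29, 0.27, 0.31]
stop_epoch = early_stopping_epoch(losses, patience=3)
print(stop_epoch)
```

A larger patience tolerates noisy validation curves at the cost of extra epochs after the best value has been reached.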
Curves of accuracy and loss for training and testing obtained during training are presented in Fig. 4, 5, respectively. As shown in Fig. 4, the model accuracy starts at a low value (0.63) and increases (to 0.94) as training continues. This behavior is typical of the standard training procedure for deep learning [27]. In Fig. 5, the model loss starts at a high value and decreases over successive epochs to a loss of 0.27. The accuracy of the model increased with the number of epochs and finally reached our desired result, which indicates that correctly chosen characteristics and a correctly chosen algorithm model help improve the performance and accuracy of glaucoma detection. Fig. 6 shows visualization results of multiple feature maps from the first convolutional layer in the convolutional model, where it can be seen that the model identified the area without damage to the fundus of the eye (Fig. 6, a) and also the area with damage in the other part of the image (Fig. 6, b).
It is worth noting that our model and the experiment that we set up can also be used in other studies, such as cardiological, pulmonary, and others, since the findings of these diseases are similarly examined during screening, using images of certain places and by their segmentation. Convolutional Neural Networks are successful for simple images, but do not give good results for complex images. It is for this reason that we have considered other algorithms such as U-Net and Res-Net and other networks. Given that Self-ONN, ONH and Res-net algorithms have their own advantages, some factors must be considered. Self-ONN analysis, ONH analysis and Res-net analysis showed that by reducing the computational complexity, models can only be trained on small datasets. Moreover, due to various shifts in data distribution, the performance of a well-behaved model can drop dramatically when deployed to a new environment and reduce data accuracy. The U-Net (fully convolutional network) had several advantages over other networks. In various literature studies, subsystems using the U-network yielded a high classification rate, providing enough information about the diagnosis, allowing you to work with a large amount of data. It also provided good quality segmentation while keeping the number of network parameters at an acceptable level. The most important quality is that the U-Net allows you to see all the fine details in the fundus, opening the possibility of determining new qualities in the definition of glaucoma in the fundus. It should also be noted that the CNN built with the U-Net was much better at classifying visual images, resulting in the model with the classifier achieving 95 % accuracy. Considering all these signs, we chose the U-Net as a model for creating a neural network.

2. Performance Comparison of Other Architectures
To confirm the result of the model, it is necessary to evaluate how efficiently the image classifier works. Almost all studies use performance evaluations with metrics such as accuracy, sensitivity, precision, and specificity. These are standard evaluation metrics that are used to determine the reliability and accuracy of the model. The evaluation was carried out by checking image data with different features of glaucoma. We tested three different areas in each patient: the retinal fundus area, the disc cup area, and the optic disc area.
For each area of the eye with different signs of glaucoma, segmentation was applied and assessed using the Dice similarity coefficient (DSC) (2). DSC in this case was used to compare the areas of glaucoma that we took as a test with the segmentation results of our automated method:
$$DSC = \frac{2TP}{2TP + FP + FN}, \quad (2)$$
in which TP, FP, and FN denote the numbers of true positives, false positives, and false negatives, respectively. To check that the model performs more efficiently than other CNN architectures, we compared its performance against a reference architecture: CIFAR100 image classification using ResNet9 [28] with regularization and data augmentation in PyTorch. According to Table 1, we can conclude that the U-Net CNN architecture that we developed takes a lot of time to train due to the number of layers and settings with different filters; however, it gives better results in terms of segmentation and data accuracy. In our training experiments, the network was implemented in Keras under TensorFlow 2.9.0 on an Intel Xeon X3440 CPU (2.53 GHz, Lynnfield core), where we trained on 100 images for comparative analysis of CNN architectures.
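As an illustration of how a Dice similarity coefficient can be computed from a predicted and a reference segmentation, the sketch below counts TP, FP, and FN pixel by pixel on two small binary masks and applies the standard definition DSC = 2TP / (2TP + FP + FN). The masks and function names are hypothetical; the study's own masks and evaluation code are not available in the text.

```python
def dice_from_counts(tp, fp, fn):
    """Dice similarity coefficient from pixel counts."""
    denominator = 2 * tp + fp + fn
    # Two empty masks are treated as a perfect match (a convention, not from the paper).
    return 1.0 if denominator == 0 else 2 * tp / denominator


def dice_from_masks(predicted, reference):
    """Compare two binary masks given as equally sized lists of rows."""
    tp = fp = fn = 0
    for pred_row, ref_row in zip(predicted, reference):
        for pred, ref in zip(pred_row, ref_row):
            if pred and ref:
                tp += 1
            elif pred and not ref:
                fp += 1
            elif ref and not pred:
                fn += 1
    return dice_from_counts(tp, fp, fn)


# Hypothetical 3 x 4 masks: 1 marks pixels labeled as the region of interest.
reference = [[0, 1, 1, 0],
             [0, 1, 1, 0],
             [0, 0, 0, 0]]
predicted = [[0, 1, 1, 1],
             [0, 1, 0, 0],
             [0, 0, 0, 0]]

score = dice_from_masks(predicted, reference)
print(score)
```

Here three pixels overlap, one is a false positive, and one is a false negative, so the score is 6 / 8.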

Discussion of study results of fundus classification
The main accomplishment of this research is that the algorithm works properly and classifies images. This means that even in the absence of glaucoma, the algorithm can accurately predict negative outcomes. Previous research [27] has shown that the accuracy of fundus image classification obtained using a fully convolutional network is significantly higher than that obtained using classical machine learning methods. In addition, various methods of glaucoma detection training that have focused on characteristics such as RNFL thickness, optic nerve, fundus, retinal area, and others, as previously mentioned were considered. Because of the fundus parameter, we achieved an accuracy of more than 94 % in this work. This parameter allowed us to consider the area of occurrence of any glaucoma symptoms and determine the disease's localization. Fig. 3 shows how we trained the model for 25-30 minutes with each feature layer to reveal accuracy and data loss.
The DL algorithm [29] was trained to evaluate fundus photographs and predict classification; the average probability of DL-glaucoma reached 99.8 %. In comparison to other works, the fundus yielded the best results. The fundus is used in our work for the first time for this reason. We have achieved a result of nearly 94 % since we began researching and applying the fundus in our model. If we had chosen different characteristics, the result could not have been higher than 85 %, as in previous studies [8]. The second reason is a fully connected layer during training that describes unequal features. In [11], these features were proposed, and their learning system is based on the CNN architecture and employs flexible and overlapping levels of standardization. Unlike the three-channel RGB image, the spectral image [30] is used to obtain images with different wavelengths.
Glaucoma is the world's second leading cause of blindness and can strike at any age. The disease affects more than 70 million people worldwide, with about 10 % suffering from bilateral blindness, making it the leading cause of irreversible blindness in the world [31]. There is still no consensus on the etiology of these conditions, but if left untreated, they cause optic nerve atrophy and visual impairment. During the observation, it was noticed that not every sign of glaucoma is accompanied by additional characteristics; therefore, doctors do not always manage to make an accurate diagnosis, and certain signs are not noticeable to the patients themselves, which eventually leads to a deterioration in vision up to blindness.
Working with the data proved to be the most difficult aspect of training the model. Deep learning models use a large dataset to learn the features required to make accurate predictions. To some extent, sharing resources between ophthalmologists and machine learning experts will help solve this problem. Deep learning algorithms require hundreds of training examples to outperform traditional methods in machine learning. As a result, as the amount of data increased, the model's training time increased significantly. Furthermore, when testing and recognizing, the model begins to become confused in the results because it is easy to fall into the abstraction trap during the learning process. The findings of this study show that a deep learning model can detect glaucoma from fundus images. Another issue is medical experts' acceptance of the deep learning approach. Deep learning algorithms are currently playing an important role in the field of medical imaging. The goal of this study was to examine the role of CNN, a highly effective type of DL, in detecting glaucoma using fundus imaging. First, after resizing and normalizing the labeled images, they were fed into the CNN. The dataset was then divided into sections for training, validation, and testing. To evaluate model performance, we used metrics such as accuracy, sensitivity, and precision.
More labeled data has the potential to improve model performance. The main point of this experiment is that all normal images from the test data were correctly classified. The developed algorithms detect glaucoma in fundus images with greater accuracy than practicing ophthalmologists. However, because CNN systems are designed for a specific task and ophthalmologists are trained to diagnose a wide range of eye diseases, the CNN framework cannot replace ophthalmologists in the diagnosis of fundus diseases. As an example, ophthalmologists could learn more about CNNs so that they can pay more attention to a larger number of patients. In terms of future research, we intend to concentrate on increasing sensitivity and accuracy by incorporating other learning technologies into the approach to achieve better classification results. Many studies have used fundus imaging to detect glaucoma. Another useful technology is optical coherence tomography (OCT), which can be used to assess glaucoma both qualitatively and quantitatively. As a future research project, we intend to investigate optical coherence tomography to detect and improve glaucoma symptoms.

Conclusions
1. Using the characteristics of the fundus, we created a model using Keras Conv2D, which helped us achieve good results in the training and testing process. The experiment showed an accuracy of 94 % and a loss value of only 0.27. To achieve these results, we increased the percentage of accuracy by increasing the filter size and training the model until we got more accurate data from the fundus images. During the training of the model, we also came to the conclusion that in the future it is possible to include a spectral image in the approach in order to improve the result of the model.
2. To determine the performance of the model, we tested benchmarking where the parameters were compared with another ResNet9 model. Based on this analysis, we can conclude that the U-Net CNN architecture we developed requires a lot of training time due to the number of layers and settings with different filters, but it gives better results in terms of segmentation and data accuracy.

Conflicts of Interest
The authors declare that they have no conflict of interest in relation to this research, whether financial, personal, authorship or otherwise, that could affect the research and its results presented in this paper.

Financing
The study was performed without financial support.

Data availability
Data cannot be made available for reasons disclosed in the data availability statement.