FCM-DNN: diagnosing coronary artery disease by deep accuracy Fuzzy C-Means clustering model

Cardiovascular disease is one of the most challenging diseases in middle-aged and older people, which causes high mortality. Coronary artery disease (CAD) is known as a common cardiovascular disease. A standard clinical tool for diagnosing CAD is angiography. The main challenges are dangerous side effects and high angiography costs. Today, the development of artificial intelligence-based methods is a valuable achievement for diagnosing disease. Hence, in this paper, artificial intelligence methods such as neural network (NN), deep neural network (DNN), and Fuzzy C-Means clustering combined with deep neural network (FCM-DNN) are developed for diagnosing CAD on a cardiac magnetic resonance imaging (CMRI) dataset. The original dataset is used in two different approaches. First, the labeled dataset is applied to the NN and DNN to create the NN and DNN models. Second, the labels are removed, and the unlabeled dataset is clustered via the FCM method, and then, the clustered dataset is fed to the DNN to create the FCM-DNN model. By utilizing the second clustering and modeling, the training process is improved, and consequently, the accuracy is increased. As a result, the proposed FCM-DNN model achieves the best performance with a 99.91% accuracy specifying 10 clusters, i.e., 5 clusters for healthy subjects and 5 clusters for sick subjects, through the 10-fold cross-validation technique compared to the NN and DNN models reaching the accuracies of 92.18% and 99.63%, respectively. To the best of our knowledge, no study has been conducted for CAD diagnosis on the CMRI dataset using artificial intelligence methods. The results confirm that the proposed FCM-DNN model can be helpful for scientific and research centers.


Introduction
Heart disease is an umbrella term that encompasses various diseases, including congenital diseases, CAD, and heart rheumatism. Based on the World Health Organization (WHO) report, CAD is the most common disease in middle-aged and older people, killing more than 360,000 Americans in 2015 [16]. Moreover, according to the statistics report of the Centers for Disease Control and Prevention, an American experiences a heart attack every 40 seconds [7]. In developing countries, more than 75% of deaths have been due to CAD [1]. Regarding the mortality in men and women, more than 50% of the mortality caused by CAD has occurred in men; CAD gives rise to 25% of deaths in the United States [8], where more than 630,000 Americans die per year [2] at a cost of more than $200 billion [9]. In general, the costs of heart diseases for patients will double by 2030, according to the American Heart Association [10]. Angiography is the most common tool for CAD diagnosis, but it has side effects and high costs for patients [7]. In scientific centers, researchers use artificial intelligence methods to provide appropriate diagnostic models instead of angiography for CAD diagnosis [11,12]. The methods utilized by artificial intelligence researchers to diagnose CAD are machine learning and deep learning [11,13,14]. In recent years, deep learning (DL) methods have been used for the effective analysis of medical images [15-20].
Accordingly, we propose methods such as neural network (NN), deep neural network (DNN), and fuzzy C-means clustering combined with DNN (FCM-DNN) for CAD diagnosis on cardiac magnetic resonance imaging (CMRI) dataset. The deep neural network (DNN) method is developed as an extended neural network (NN) method, which leads to higher detection accuracy, lower false rate, and lower deviation [21]. In this study, the image set contains labels with healthy and sick classes. To implement, both labeled and unlabeled data are considered for the training process. First, the labeled data is trained and tested using the NN and DNN methods so that the created NN and DNN models are evaluated under the criteria of accuracy, precision, sensitivity, specificity, F1-score, false positive rate, false negative rate, and area under the curve (AUC). Second, since the other model is a hybrid FCM-DNN model, the input data must be unlabeled. For this purpose, the data labels are removed, and the fuzzy C-means clustering method is applied to specify 10 clusters, 5 clusters for healthy subjects, and 5 clusters for sick subjects. Then, the clustered data is fed to the DNN. The created FCM-DNN model is also evaluated under the criteria mentioned above. As a final result, the proposed hybrid FCM-DNN method is a very accurate method with a maximum accuracy of 99.91% compared to the related methods used for CAD diagnosis.
In summary, the innovations of this paper are as follows:
- Providing the CMRI dataset to diagnose CAD for the first time
- According to the latest studies, using the developed FCM-DNN model to diagnose CAD by removing the labels of the data for the first time
- Improving the DNN training and preventing data over-fitting by performing operations such as selecting Maxout without the need for dropout, using the K-fold Cross-Validation (K-FCV) technique, and feeding the data clustered by the FCM to the DNN
- Achieving a very high accuracy using the proposed hybrid model for diagnosing CAD on the CMRI dataset
As the latest scientific achievement, the FCM-DNN model is performed for the first time using the CMRI analysis.
Currently, tools such as exercise stress testing (EST), chest x-ray, computed tomography scan, CMRI, coronary angiography, and ECG are used to diagnose the severity of heart disease in patients [22,23]. In recent years, more studies have been carried out in the field of CAD diagnosis based on ECG signals and numerical datasets using artificial intelligence methods.
In a study by Babaoglu et al. [24], CAD diagnosis has been made using genetic algorithm (GA), binary particle swarm optimization (BPSO) algorithm, and support vector machine (SVM) algorithm on EST dataset. In addition, GA and BPSO algorithms have been applied as feature selection techniques. In their study, 408 patients have been tested through EST and coronary angiography. A total of 23 features have been extracted from the EST dataset. Using the BPSO algorithm, the diagnosis accuracy rate reaching 81.46% is the best compared to the GA and SVM algorithms achieving 79.17% and 76.67% accuracies, respectively.
Kumar et al. [25] have used ECG signals including 40 healthy subjects and 7 sick subjects for CAD diagnosis. The ECG signals have been mapped into pulses, which were mainly decomposed by analytical wavelet transform. The least-squares support vector machine with the radial basis function (RBF) kernel has been used for classification. As a result, the Violet kernel or Morlet wavelet kernel with the accuracy of 99.60% has provided higher accuracy than the RBF kernel, reaching the accuracy of 99.56% using the 10-fold cross-validation (10-FCV) technique. Alizadehsani et al. have utilized a feature engineering method for improving CAD diagnosis on 500 samples. This method has exploited the results related to the NB, C4.5, and SVM classifiers for the non-invasive diagnosis of CAD. They have also used the weight by SVM method as a feature selection method. Based on the NB, C4.5, and SVM classifiers, the accuracy rates of 86%, 89.8% and 96.40% have been achieved, respectively.
In a study by Abdar et al. [33], a hybrid two-level genetic algorithm and nuSVM, namely the N2Genetic-NuSVM method, has been proposed for CAD diagnosis on 303 samples. They have used a two-level genetic algorithm to optimize the SVM parameters and have also accomplished feature selection by applying the GA algorithm. An accuracy of 93.08% has been achieved using the proposed method.
In the study conducted by Miao and Miao [34], a DNN model has been presented for CAD diagnosis on the Cleveland Clinic Foundation dataset with 303 patients. The proposed DL model includes 28 input units, first and second hidden layers, and a binary output unit, in which 105 neurons in the first layer and 42 neurons in the second layer have been considered, and 50% dropout has been assigned. The output unit has been connected to a sigmoid activation function in the final stage. An accuracy of 83.67% has been obtained using the proposed method. Hamersvelt  [ . In their study, an 11-layer CNN structure, including four convolutional layers, four max-pooling layers, and three fully-connected layers, has been developed. Moreover, an overall of 95,300 segmented ECG signals have been utilized for the first network (2 seconds), and a total of 38,120 segmented ECG signals have been used for the second network (5 seconds). The proposed CNN model has achieved the accuracies of 94.95% and 95.11% for the first and second networks, respectively.
Tan et al. have introduced a long short-term memory (LSTM) neural network model combined with a CNN model for CAD diagnosis based on ECG signals on the PhysioNet database. Accordingly, an 8-layer stacked convolutional LSTM network has been designed, in which layers 1 to 4 consist of two convolutional layers and two layers of max pooling for the CNN structure, layers 5 to 7 relate to the LSTM layers, and the last layer is a fully connected layer as the classification layer. The proposed method has achieved an accuracy of 99.85%.
have investigated the K-nearest neighbor (KNN) classifier to classify and diagnose CAD on ECG signals. To extract features, they have used methods such as discrete cosine transform, discrete wavelet transform, and empirical signal decomposition into intrinsic state components. Besides, these methods have been compared in the disease diagnosis process. The ECG signals have also been applied to the appropriate transformation methods to obtain coefficients. Then, the features have been reduced using the locality preserving projection method, and the reduced features have been ranked applying the analysis of variance technique. In the following, high-ranking features have been fed to the KNN classifier. As a result, the proposed model provided the best performance reaching the accuracy of 98.5% via the discrete cosine transform method using only seven features.
Acharya et al. [40] have presented a CNN method to diagnose CAD on ECG signals. The dataset includes 30,000 patients and 110,000 healthy persons. As a result, the proposed method leads to an accuracy of 98.97% using the 10-FCV technique.
In a study by Ghiasi  with 303 patients and 55 features to diagnose CAD. They have compared their model with classification models such as SMO, bagging, bagging with SMO, NB, artificial neural network, and J48 and C4.5 decision trees. The accuracy rate of 100% has been gained using the CART model for CAD diagnosis.
To identify the risk factors for CAD, Verma et al. have implemented a combined model of correlation-based feature subset (CFS) selection with particle swarm optimization (PSO) and K-Means clustering algorithms on 335 samples with 26 features. After applying CFS and PSO, five features have been identified as risk factors. In addition, multi-layer perceptron (MLP), multinomial logistic regression (MLR), fuzzy unordered rule induction algorithm, and C4.5 decision tree have been implemented for CAD diagnosis. As a result, the highest accuracy of 88.4% has been obtained using the MLR algorithm.
Idris et al. [43] have developed data mining models, including NN, logistic regression (LR), KNN, NB, SVM, deep learning, and Vote (an ensemble method with NB and LR) on the Malaysian National Cardiovascular Disease-Acute Coronary Syndrome datasets from the University of Malaya medical centre (UMMC) and Sultanah Aminah hospital (SAH) to predict the CAD. Feature selection methods such as the Chi-squared test, recursive feature elimination, and embedded decision tree have been applied. The prediction accuracy rates of 94.5% and 89.7% have been obtained through the NN method combined with the embedded decision tree method on the UMMC and SAH datasets, respectively.
Velusamy and Ramasamy [44] have examined three classification methods, including SVM, random forest, and KNN, for CAD diagnosis on the Z-Alizadeh Sani dataset. The results of the classifiers have been combined based on the weighted-average voting, majority-voting, and average-voting methods. The weighted-average voting method and five selected features lead to better performance with an accuracy rate of 98.97% compared to other classifiers on the original Z-Alizadeh Sani dataset. In addition, the proposed algorithm reaches the accuracy of 100% on the Z-Alizadeh Sani balanced dataset.
According to the previous works, researchers have investigated three types of datasets, including numerical, CT scan, and ECG signal datasets for CAD diagnosis. In this paper, we have utilized the MRI dataset to diagnose CAD for the first time. The strength of this research is the use of the CMRI dataset in two labeled and unlabeled forms, with the NN and DNN methods applied to the labeled data and the FCM-DNN method applied to the unlabeled data. Moreover, in the previous works, the important accuracy evaluation criterion has been calculated on labeled data, while in our paper, a great accuracy rate of 99.91% has been obtained based on the FCM method in combination with the DNN classifier on the unlabeled data.
The rest of the paper is structured as follows: The proposed methodology is introduced in Section 2. The evaluation of models, experimental results, and research findings are expressed in Section 3. Comparison with the previous research and discussion are presented in Section 4. Finally, the conclusion and future work are given in Section 5.
In this paper, NN, DNN and fuzzy C-means clustering combined with deep neural network (FCM-DNN) methods are used on the CMRI dataset to classify the images and diagnose the CAD. The proposed methodology is implemented in 6 phases, including collecting clinical image sets related to CMRI dataset for healthy and sick subjects, data preprocessing, CMRI dataset partitioning, classification models, evaluation criteria of the models, and experimental results and their interpretation for classification of the CMRI images and diagnosis of the CAD. The proposed methodology is shown in Figure 1.

Phase 1: collecting clinical CMRI dataset
The first phase is the extraction of CAD clinical image sets related to CMRI. This dataset is provided from Milad hospital in Tehran, Iran, by Z. Alizadeh Sani. The dataset utilized in this paper includes 4965 images, of which 2569 images are related to 16 healthy subjects and the remaining 2396 images are associated with 14 sick subjects. All the images are grayscale, and their dimensions vary for healthy and sick subjects. For example, four images of healthy and sick subjects are illustrated in Figures 2 and 3, respectively. In addition, the statistical characteristics of our dataset for healthy and sick subjects are stated in Tables 1 and 2, respectively.

Phase 2: data preprocessing
In the sample analysis process, preprocessing the samples is required. The images of healthy and sick subjects in the dataset differ in size, so they are resized to 100 × 100 pixels. Furthermore, one of the available approaches for preprocessing image samples is data normalization between 0 and 1. Normalization increases the accuracy of clustering and classification models and also reduces the false rate of clustering. The type of normalization method is determined as interval transformation, i.e., the sample set is normalized between 0 and 1. Indeed, by normalizing the images, the light intensity of the images lies in the interval between 0 and 1.
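The two preprocessing operations can be sketched as follows. This is a minimal illustration, not the authors' implementation: the text does not state which interpolation is used for resizing, so nearest-neighbour sampling is assumed here, and the function names `resize_nearest` and `normalize_01` are hypothetical.

```python
def resize_nearest(image, size=100):
    """Resize a 2-D grayscale image (list of rows) to size x size.

    Assumption: nearest-neighbour sampling; the paper does not name a method.
    """
    rows, cols = len(image), len(image[0])
    return [
        [image[r * rows // size][c * cols // size] for c in range(size)]
        for r in range(size)
    ]


def normalize_01(image):
    """Interval (min-max) normalization of pixel intensities to [0, 1]."""
    flat = [p for row in image for p in row]
    lo, hi = min(flat), max(flat)
    if hi == lo:  # constant image: avoid division by zero
        return [[0.0 for _ in row] for row in image]
    return [[(p - lo) / (hi - lo) for p in row] for row in image]


# Synthetic 3 x 4 image with 8-bit intensities (illustrative data only)
raw = [[0, 64, 128, 255], [10, 20, 30, 40], [255, 255, 0, 0]]
prepared = normalize_01(resize_nearest(raw, size=100))
```

After this step every image has the same shape and intensity range, which is what allows images of different original sizes to be compared by a distance-based method such as FCM.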

Phase 3: image set partitioning
In this paper, the K-FCV technique is applied for the partitioning phase of the CMRI dataset, i.e., the data is divided based on the 10-FCV, 7-FCV, and 5-FCV techniques. Utilizing the K-FCV technique, the images are divided into K parts so that K-1 parts are used for training and 1 part for testing. By rotating the test image set, the K-FCV process is repeated K times. The advantages of the K-FCV technique are that it helps prevent data over-fitting, improves training, and reduces loss. Moreover, applying the K-FCV technique leads to more training data points to develop the expected model. The partitioning process for training, testing, and validating the CMRI dataset through the 10-FCV, 7-FCV, and 5-FCV techniques is shown in Figures 4, 5 and 6, respectively. According to Figures 4-6, with the 10-FCV technique, 0.9 of the data is used for training and the remaining 0.1 for testing. With the 7-FCV technique, six-sevenths of the data is applied for training and the remaining one-seventh for testing. With the 5-FCV technique, four-fifths of the data is utilized for training and the remaining one-fifth for testing. In the next step, 0.8 of the training data is kept for training and 0.2 of the training data is used for validation, and the partitioning process is carried out 10, 7 and 5 times for the 10-FCV, 7-FCV and 5-FCV techniques, respectively.
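The partitioning described above can be sketched as an index generator. This is an illustrative sketch under stated assumptions: the function name `k_fold_splits`, the shuffling, and the seed are hypothetical choices, and the split is made at the image level as the text describes.

```python
import random


def k_fold_splits(n_samples, k, val_fraction=0.2, seed=0):
    """Yield (train, val, test) index lists, one triple per fold."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # Fold sizes differ by at most one when n_samples is not divisible by k.
    folds = [indices[i::k] for i in range(k)]
    for i in range(k):
        test = folds[i]
        rest = [idx for j, fold in enumerate(folds) if j != i for idx in fold]
        # Hold out a fraction of the remaining K-1 parts for validation.
        n_val = int(len(rest) * val_fraction)
        yield rest[n_val:], rest[:n_val], test


for k in (10, 7, 5):
    train, val, test = next(k_fold_splits(4965, k))
    print(k, len(train), len(val), len(test))
```

Each image index appears in exactly one test fold, so over the K repetitions every image is tested once and used for training or validation K-1 times.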

Phase 4: classification models
The most common system for diagnosing CAD is angiography. This system has many side effects and high costs for patients. On the other hand, CAD can lead to myocardial infarction if the patient's condition is not correctly diagnosed during testing and also is not treated on time. Therefore, it is essential to use intelligent automated decision-making systems and technologies for CAD diagnosis. In recent years, researchers have tried to use artificial intelligence techniques as an alternative to angiography for the early diagnosis of CAD. Hence, in this paper, NN, DNN, and FCM-DNN classification methods are proposed to be applied to the CMRI dataset. The creation of the NN, DNN and FCM-DNN models is described in detail below.

Creating NN model
The structure of the NN is derived from the structure of the human neural network in the biological brain. In the human neural network, there are a series of functional units called cells and neurons. In neurons, the data is in the form of pulses that enter and exit the cell so that as the pulse passes through the cell, a series of processes take place in the cell nucleus. This process is learned all over human life, and the so-called neural network structure is trained throughout human life. In general, the standard NN is one of the classification methods in which the created model is identified as a set of interconnected nodes with their weighted connections. This created model includes the input layer, hidden layer, and output layer. The process of generating output is such that each of the input dimensions is multiplied by a weight factor. Then, the sum of the multiplications of these weights passes through a nonlinear function, which eventually produces a new output. In other words, in this neural network, there is a layer called the feature layer or hidden layer, the output of which is the feature space, and the input to the last layer, i.e., the classifier layer, which determines the class of the input data.
Hence, in this paper, the created NN model is a 4-layer model. The first layer is related to the input images. The second and third layers are specified as the hidden layers with three neurons, and the last layer is determined as output class (sick/healthy). The parameters settings for the NN model are described in Table 3, and the NN model is presented in Figure 7.

Creating DNN model
The standard NN model tends to have a high error deviation, which can lead to adverse effects on the classification performance. To address this problem, the DNN model, as an extended NN model, improves the classification performance by increasing the number of hidden layers from 3 to 100 and more. Another strength of the DNN model is that the data is transferred from one hidden layer to another so that simpler features are recombined and recomposed into complex features to generate the desired output. The advantages of the deep learning model are expressed below:
1) Automated feature learning: The DNN model automatically extracts appropriate features from the data and is so-called trained.
2) Multi-layer feature learning: Based on the DNN model, there is the ability to simultaneously access features at different levels in a hierarchical manner, from low-level features to complex level features.
3) High accuracy of deep neural network diagnosis: The accuracy of the DNN model in the output is higher than the accuracy of the NN model.
4) High generalization power of the network: High generalization power means that, in addition to the data on which the DNN was trained, if new data similar to the training data is fed to the network, the highly developed DNN model can diagnose the new data as well.
In the DNN model, similar to the NN model, the images are applied to the input layer, and the class of the input images is specified in the output layer. Therefore, in this paper, the created DNN model is an 8-layer model including one input layer for the images, six hidden layers, and one output layer. The Maxout [45,46] is selected as a nonlinear activation function, which assigns the activity of the neurons to the hidden layers of the network. Indeed, the Maxout determines the utmost coordinate of the network input vector, which is effective against over-fitting of the input data, reduces the complexity, and improves the deep network training.
The Maxout function is defined as follows for two classes: According to Eq (1), , , and represent the random initial weight, the elements entries of the weights, and the feature vector of the input images for the sick and healthy classes, respectively.
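For orientation, the general form of a Maxout unit takes the largest of several affine responses, h(x) = max_j (w_j · x + b_j). The sketch below illustrates that general form with two affine pieces; the symbol names, the bias terms, and the example values are assumptions for illustration and are not the notation of Eq (1).

```python
def maxout(x, weights, biases):
    """Maxout unit: the largest of several affine responses w_j . x + b_j."""
    responses = [
        sum(w_i * x_i for w_i, x_i in zip(w, x)) + b
        for w, b in zip(weights, biases)
    ]
    return max(responses)


# Two affine pieces over a 3-dimensional feature vector (illustrative values)
x = [0.2, 0.5, 0.1]
weights = [[1.0, -1.0, 0.5], [-0.5, 2.0, 0.0]]
biases = [0.0, 0.1]
h = maxout(x, weights, biases)
```

Because the unit outputs the maximum of linear pieces, it is piecewise linear and its shape is learned through the weights rather than fixed in advance.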
The parameters settings for the DNN model are described in Table 4, and the DNN model is presented in Figure 8. To classify the healthy and sick subjects by determining the value of 1 for the healthy subject and the value of 0 for the sick subject, a sigmoid function [47,48] is assigned to the last layer. Moreover, a cross-entropy (CE) function [49] is defined as the loss function. The formulas of these functions are defined as follows:

$$F(S_i) = \frac{1}{1 + e^{-S_i}}, \qquad S_i = w\,x_i \tag{2}$$

$$CE = -\sum_{i=1}^{C} y_i \log\big(F(S_i)\big) \tag{3}$$

In Eq (2), S_i is the output value of the decision boundary, F(S_i) is the probability value of the predicted class and lies between 0 and 1, x_i is the input image, and w is the weight. In Eq (3), C represents the number of classes, and y_i indicates the predicted value of the desired class. Since, in this paper, the number of classes is two, the CE function is calculated as below:

$$CE = -\sum_{i=1}^{2} y_i \log\big(F(S_i)\big) = -y_1 \log\big(F(S_1)\big) - (1 - y_1)\log\big(1 - F(S_1)\big) \tag{4}$$

According to Eq (4), F(S_2) is equal to 1 - F(S_1). Also, the pseudo-code of the DNN model is presented below.
10. Assign sigmoid function (Eq (2)) for classifying Sick/Healthy subjects in the output layer
11. Calculate loss function through Eq (4)
13. Return: obtain the evaluation criteria and diagnose the Sick/Healthy classes for input images
14. End
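The output layer and loss described above can be illustrated numerically. This is a minimal sketch assuming the standard logistic sigmoid and the standard two-class cross-entropy, with the label convention stated in the text (1 for healthy, 0 for sick); the function names and the clamping constant `eps` are hypothetical.

```python
import math


def sigmoid(s):
    """Logistic function mapping a score s to a probability in (0, 1)."""
    return 1.0 / (1.0 + math.exp(-s))


def binary_cross_entropy(y_true, p_healthy, eps=1e-12):
    """Two-class cross-entropy for a label y_true in {0, 1}."""
    p = min(max(p_healthy, eps), 1.0 - eps)  # clamp so log() stays finite
    return -(y_true * math.log(p) + (1 - y_true) * math.log(1.0 - p))


score = 2.0                                   # illustrative output-layer score
p = sigmoid(score)                            # predicted probability of "healthy"
loss_if_healthy = binary_cross_entropy(1, p)  # small: prediction agrees
loss_if_sick = binary_cross_entropy(0, p)     # large: prediction disagrees
```

The loss is small when the predicted probability agrees with the label and grows without bound as a confident prediction turns out to be wrong, which is what drives the weight updates during training.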

Creating FCM-DNN model
Clustering is a standard descriptive method identifying a finite set of categories/clusters for describing similar data. In other words, clustering is the grouping of samples with similar characteristics. The samples of one group have the most similarity to each other and the most difference from the samples of other groups. Each cluster has a center, and the degree of similarity of the data to the cluster center is generally determined by a parameter called the similarity criterion/distance criterion [50]. Indeed, in clustering, the similarity criterion is determined based on maximizing the separation between clusters.
Therefore, in clustering, the categories are not predefined, and the data grouping operation is done without supervising or labeling, i.e., the training data do not have a label. The suitable performance of a clustering method is such that the samples of different clusters have the least similarity. A standard clustering model is shown in Figure 9. In general, in all the clustering methods, the goal is to minimize the intra-cluster distance and maximize the inter-cluster distance [51]. Standard clustering algorithms in vector quantization are K-Means [52,53], K-Medoids [54], and FCM. In this paper, the aim is to create the model in two ways. First, the labeled dataset is fed to the NN and DNN to create the NN and DNN models of the data. Second, the labels are removed from the initial dataset, and the unlabeled data is clustered using the FCM method, and then, the clustered data is fed to the DNN to create the hybrid FCM-DNN model. Here, the utilized FCM clustering method is explained in detail.
The FCM method was first proposed in 1973 by Duda and Hart [55], and it performs a more accurate clustering than classic clustering methods, such as K-Means and K-Medoids, under uncertain conditions. Unlike classical clustering methods, fuzzy clustering methods are appropriate for allocating data to more than one cluster.
A fuzzy version of the C-Means clustering method has been proposed by Dunn to solve the problem of allocating images to more than one cluster [56]. Later, the FCM method was developed by Bezdek [57], in which a fuzzy factor of m has been defined as a fuzzifier.
The main idea of the FCM clustering is that a sample can belong to more than one cluster with a membership degree between 0 and 1 based on the membership function/objective function [58-61].
In the FCM method, the membership function is as follows:

$$J = \sum_{i=1}^{C} \sum_{k=1}^{n} U_{ik}^{\,m} \,\lVert X_k - V_i \rVert^{2} \tag{5}$$

where the variable m is a real number larger than 1, which is assigned to be 2 in most cases. In the given formula, if the variable m is set equal to 1, the objective function of C-Means clustering will be obtained. Moreover, in the stated formula, the variable X_k is the sample k, V_i is the center of cluster i, the number of clusters "C" is predetermined, and n represents the number of samples. U_ik indicates the degree of belonging of the sample k to the cluster i.
The distance between the sample X_k and the cluster center V_i is computed as follows:

$$d_{ik} = \lVert X_k - V_i \rVert \tag{6}$$

The most crucial similarity criterion for solving clustering problems is the distance criterion "d", which must be minimized. In other words, the FCM method determines the data for each cluster based on the distance between the cluster center and the data points by assigning membership to each data point based on the membership function.
In summary, the FCM method includes the following steps:
- C cluster centers are randomly assigned.
- The distance d of each sample from the cluster centers mi and mj is obtained, and the degree of belonging Uimi of each sample is determined from these distances.
- New centers of the clusters are obtained using fuzzy means. If we have two clusters, the new centers m1 and m2 are computed from the samples Xi weighted by their degrees of belonging.
- Finally, the fuzzy intra-cluster-based sum of the distances is calculated under the objective function "J", which must be optimized. Based on Eq (8), J is the sum of the distances, "C" is the number of clusters, "Uij" is the degree of belonging of the sample i to the cluster j, and "dij" is the distance of sample i from the center j.
For two consecutive iterations, if the change in the sum of the distances is less than the threshold value, the FCM method will terminate. In this situation, the final cluster centers are determined.
The parameters settings for the FCM method are presented in Table 5. An advantage of the FCM method is that it always converges, and does so rapidly, although the solution it reaches is a local optimum.
Despite this advantage, the FCM method needs more computation time than the classic clustering methods because of the additional calculations for allocating each data point to all the clusters. However, the crucial advantage of data clustering using the FCM method is achieving higher accuracy.
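The iteration described above can be sketched in a few lines. This is a minimal sketch of the textbook (Bezdek-style) FCM updates with Euclidean distance, not the RapidMiner operator used in the paper; the function name, the tolerance `tol`, the seed, and the toy data are assumptions, while m = 2 and the cap of 50 iterations follow the text.

```python
import random


def fuzzy_c_means(points, c, m=2.0, max_iter=50, tol=1e-5, seed=0):
    """Cluster points (lists of floats) into c fuzzy clusters.

    Returns (centers, u), where u[k][i] is the degree to which sample k
    belongs to cluster i; every row of u sums to 1.
    """
    rng = random.Random(seed)
    centers = [list(p) for p in rng.sample(points, c)]  # random initial centers
    dim = len(points[0])
    prev_j = float("inf")
    for _ in range(max_iter):
        # Squared Euclidean distance of every sample to every current center.
        d2 = [[max(sum((a - b) ** 2 for a, b in zip(p, v)), 1e-12)
               for v in centers] for p in points]
        # Membership update: u_ki = 1 / sum_j (d_ki / d_kj)^(2 / (m - 1)).
        u = [[1.0 / sum((row[i] / row[j]) ** (1.0 / (m - 1.0)) for j in range(c))
              for i in range(c)] for row in d2]
        # Objective for this iteration: membership-weighted squared distances.
        j_value = sum(u_k[i] ** m * d_k[i]
                      for u_k, d_k in zip(u, d2) for i in range(c))
        # Center update: fuzzy mean of the samples, weighted by u^m.
        centers = []
        for i in range(c):
            w = [u_k[i] ** m for u_k in u]
            total = sum(w)
            centers.append([sum(w_k * p[d] for w_k, p in zip(w, points)) / total
                            for d in range(dim)])
        if abs(prev_j - j_value) < tol:  # change between consecutive iterations
            break
        prev_j = j_value
    return centers, u


# Toy 1-D data with two well-separated groups (illustrative only)
data = [[0.0], [0.1], [0.2], [5.0], [5.1], [5.2]]
centers, u = fuzzy_c_means(data, c=2)
```

Unlike K-Means, every sample keeps a nonzero membership in every cluster, which is the source of the extra computation per iteration.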
In this paper, the FCM-DNN method is examined on the CMRI dataset. Firstly, the dataset is clustered for identifying the clusters by the FCM method. The number of the clusters is assigned as 10. It should be noted that the images were initially labeled as healthy and sick subjects. The labels have been removed for clustering. Then, 10 clusters are determined for clustering operations; 5 clusters for healthy subjects and 5 clusters for sick subjects.
After applying fuzzy clustering, the generated dataset is fed to the DNN model for classifying the CMRI dataset. The developed FCM-DNN model diagnoses the input image between 10 clusters, including healthy and sick classes. The pseudo-code of the FCM-DNN model is presented below.
2. Image set preprocessing: Change the image size to 100 × 100 and normalize the data
4. While (The termination condition is not fulfilled by 10-FCV, 7-FCV, and 5-FCV techniques or the iterations are less than 50 based on Table 5) do
6. Compute the distance of each sample to C cluster centers and determine the degree of belonging to each sample based on Eqs (5) to (7)
7. Obtain new centers for the clusters using fuzzy means based on Eq (8)
8. Calculate the fuzzy intra-cluster-based sum of the distances based on Eq (9)
9. Generate the dataset based on 10 clusters (5 clusters for healthy subjects and 5 clusters for sick subjects)
10. Apply DNN training for each image according to the generated dataset
11. Create FCM-DNN model
12. Apply FCM-DNN validation for each image
13. Apply FCM-DNN model classification for testing input images
14. Assign sigmoid function (Eq (2)) for classifying Sick/Healthy subjects in the output layer
15. Compute loss function
16. End while
17. Return: obtain the evaluation criteria and diagnose the Sick/Healthy classes for input images
18. End
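One way to read the step that turns fuzzy memberships over 10 clusters into a Sick/Healthy decision is sketched below. The text does not spell out how clusters are tied to the two classes, so the mapping used here (cluster indices 0-4 healthy, 5-9 sick) is purely an assumption, as are the function names and the example membership values.

```python
def hard_cluster_labels(memberships):
    """Assign each sample to the cluster with the highest membership degree."""
    return [max(range(len(row)), key=row.__getitem__) for row in memberships]


def cluster_to_diagnosis(cluster_id, healthy_clusters=frozenset(range(5))):
    """Collapse a cluster index in 0..9 to a binary diagnosis.

    Assumption: clusters 0-4 are the healthy ones and 5-9 the sick ones.
    """
    return "healthy" if cluster_id in healthy_clusters else "sick"


# Fuzzy memberships of two samples over 10 clusters (each row sums to 1)
memberships = [
    [0.02, 0.70, 0.05, 0.03, 0.05, 0.03, 0.04, 0.03, 0.03, 0.02],
    [0.01, 0.02, 0.02, 0.03, 0.02, 0.05, 0.05, 0.75, 0.03, 0.02],
]
labels = hard_cluster_labels(memberships)
diagnoses = [cluster_to_diagnosis(c) for c in labels]
```

Under this reading the DNN is trained on the 10 cluster labels, and the binary diagnosis follows from which group of clusters the predicted label falls into.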

Evaluation and experimental results
In this section, we have evaluated the models based on the fifth phase of the proposed methodology. The evaluation criteria of the models, including accuracy (ACC), precision or positive predicted value (PPV), sensitivity (SEN), specificity (SPC), F1-score, false positive rate (FPR), false negative rate (FNR), and area under the curve (AUC), are measured using a confusion matrix (CM) [62]. The CM includes true positive (TP), false positive (FP), true negative (TN), and false negative (FN) elements. The CM utilized in this paper is described in Table 6. According to the sixth phase of the proposed methodology, the experimental results of the models are illustrated in Table 7 in terms of the evaluation criteria and the number of folds. The experimental environment includes an Intel(R) Core(TM) i5-4200U CPU @ 1.60 GHz to 2.30 GHz, 6 GB of RAM, the Windows 10 operating system, an x64-based processor, and an NVIDIA GeForce 840M, and the methods are implemented using the RapidMiner software version 9.5.0 [46].
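The count-based criteria listed above follow directly from the four CM elements. The sketch below uses their standard definitions; the function name is hypothetical, the example counts are invented for illustration and are not taken from Table 7, and AUC is omitted because it requires the ranked scores rather than the counts.

```python
def classification_metrics(tp, fp, tn, fn):
    """Derive the evaluation criteria from the four confusion-matrix counts."""
    ppv = tp / (tp + fp)   # precision
    sen = tp / (tp + fn)   # sensitivity (recall)
    return {
        "ACC": (tp + tn) / (tp + fp + tn + fn),
        "PPV": ppv,
        "SEN": sen,
        "SPC": tn / (tn + fp),
        "F1": 2 * ppv * sen / (ppv + sen),
        "FPR": fp / (fp + tn),
        "FNR": fn / (fn + tp),
    }


# Hypothetical counts, not the paper's results
metrics = classification_metrics(tp=90, fp=5, tn=95, fn=10)
```

Note that FPR equals 1 - SPC and FNR equals 1 - SEN, so the two false rates carry the same information as specificity and sensitivity.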
According to Table 7, the ACC, PPV, SEN, SPC, F1-score, FPR, FNR and AUC rates are obtained using the NN, DNN and FCM-DNN methods on the CMRI dataset.
The most crucial criterion for disease diagnosis is ACC. The ACC rate for CAD diagnosis using the proposed FCM-DNN method is higher than that of the NN and DNN methods.
The ACC of the FCM-DNN method is obtained as 99.91% on 4965 images using the 10-FCV technique, while the accuracy of the DNN and NN methods is achieved as 99.63% and 92.18%, respectively. In addition, the FPR and FNR criteria are essential for determining the false rate of diagnosing the disease for clinical centers so that the FPR is more valuable than the FNR for identifying more risks. The FPR value is achieved as zero, while the FNR value is gained as 0.18 using the proposed FCM-DNN method. Furthermore, utilizing the DNN method, the value of the FPR is gained as 0.42, and the FNR value is obtained as 0.32. Moreover, applying the NN method, the FPR value is calculated as 12.21, while the FNR value is obtained as 3.56. As a result, the FCM-DNN method has a lower false rate than the NN and DNN classification methods.
Another crucial criterion for evaluating classification models is the AUC, which is the area under the receiver operating characteristic (ROC) curve. The ROC curves of the NN, DNN and FCM-DNN models under the 5-FCV, 7-FCV and 10-FCV techniques are shown in Figures 10(a)-(c), 11(a),(b) and 12, respectively. [...] Figures 10(b),(c), respectively. The AUC value for the DNN method based on the 10-FCV and 5-FCV techniques is 99.9%, as illustrated in Figure 11. The AUC value for the DNN method based on the 7-FCV technique is 100%, as shown in Figure 11(b). Finally, according to Figure 12, the AUC value for the FCM-DNN method is 100% under the 10-FCV, 7-FCV and 5-FCV techniques.
As a result, the FCM-DNN model has the best AUC value compared to the NN and DNN models under the 10-FCV and 5-FCV techniques, and it ties with the DNN model at 100% under the 7-FCV technique.
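To make the AUC criterion concrete, the following minimal sketch uses the standard fact that the area under the ROC curve equals the probability that a randomly chosen positive sample receives a higher score than a randomly chosen negative one, with ties counted as one half. The function name `roc_auc` and the labels and scores are hypothetical; they are not outputs of the models in this paper, whose AUC values were produced by RapidMiner.

```python
def roc_auc(labels, scores):
    """AUC as the fraction of (positive, negative) pairs ranked correctly.

    labels: 1 for the positive class, 0 for the negative class.
    scores: classifier confidence for the positive class.
    Ties between a positive and a negative score count as 1/2.
    """
    pos = [s for y, s in zip(labels, scores) if y == 1]
    neg = [s for y, s in zip(labels, scores) if y == 0]
    if not pos or not neg:
        raise ValueError("need at least one positive and one negative sample")
    wins = 0.0
    for p in pos:
        for n in neg:
            if p > n:
                wins += 1.0
            elif p == n:
                wins += 0.5
    return wins / (len(pos) * len(neg))


# Hypothetical labels and scores for illustration only.
labels = [1, 1, 1, 0, 0, 0]
perfect = roc_auc(labels, [0.9, 0.8, 0.7, 0.3, 0.2, 0.1])     # every positive outranks every negative
imperfect = roc_auc(labels, [0.9, 0.8, 0.25, 0.3, 0.2, 0.1])  # one positive is outranked by one negative
print(perfect, imperfect)
```

An AUC of 100% therefore means that some decision threshold separates the two classes without error on the evaluated data; it does not by itself say that the threshold actually used does so, which is why the FPR and FNR are reported separately.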

Discussion
Recent advances in artificial intelligence methods using image processing for CAD diagnosis have attracted more researchers to the subject. Automatically distinguishing sick from healthy images with artificial intelligence methods can be a crucial step in medical interpretation. Deep learning is among the most common methods for image processing. In this paper, an 8-layer deep learning model combined with fuzzy C-means clustering has been used for CAD diagnosis. In addition, neural network and deep neural network methods have been implemented and evaluated. To our knowledge, this is the first time these three methods have been applied to the CMRI dataset. Moreover, 10-fold, 7-fold, and 5-fold cross-validation techniques have been used to evaluate the models. The experimental results demonstrate that the proposed deep learning model improves the automatic diagnosis of CAD in terms of accuracy, precision, sensitivity, specificity, F1-score, false positive rate, false negative rate, and AUC value.
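The k-fold cross-validation used above can be sketched as follows: the samples are shuffled once and split into k nearly equal folds, and each fold serves exactly once as the test set while the remaining folds form the training set. This is a plain shuffled split written for illustration; whether the RapidMiner operator used in this paper applies stratified or another sampling scheme is not stated here, so the sampling strategy, the function name `k_fold_indices`, and the seed are assumptions.

```python
import random


def k_fold_indices(n_samples, k, seed=0):
    """Yield (train, test) index lists for k nearly equal folds."""
    indices = list(range(n_samples))
    random.Random(seed).shuffle(indices)
    # The first (n_samples % k) folds receive one extra sample.
    base, extra = divmod(n_samples, k)
    start = 0
    for fold in range(k):
        size = base + (1 if fold < extra else 0)
        test = indices[start:start + size]
        train = indices[:start] + indices[start + size:]
        start += size
        yield train, test


# Fold sizes for the 4965 images under the three settings used in the paper.
fold_sizes = {
    k: [len(test) for _, test in k_fold_indices(4965, k)]
    for k in (5, 7, 10)
}
for k, sizes in fold_sizes.items():
    print(k, sizes)
```

With 4965 images, 5-fold validation tests on 993 images per fold, 7-fold on 709 or 710, and 10-fold on 496 or 497, so a larger k trains each model on more data at the cost of more training runs.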
The performance of the proposed models is compared based on various criteria in Figure 13. According to Figure 13, the FCM-DNN method has the best performance compared to the NN and DNN methods in terms of the evaluation criteria. Therefore, of the three methods, FCM-DNN is the most reliable for CAD diagnosis on this dataset. Previous studies and the present study are compared in Table 8 regarding the accuracy of CAD diagnosis, methods, and datasets.

[Table 8: comparison with previous studies. Only the final row is recoverable: [40], 4965, CMRI, FCM-DNN, 99.91, In this paper]

According to Table 8, previous studies have been carried out on three types of datasets, namely numerical data, ECG signal data, and CT angiographic data, to diagnose CAD. For the first time, we have applied AI methods to the CMRI image set, and among the NN, DNN, and FCM-DNN methods, the hybrid FCM-DNN method achieved the highest accuracy of 99.91% for CAD diagnosis on 4965 images.

Conclusions and future work
Coronary artery disease, also known as coronary artery stenosis, is the most common disease in middle-aged and older people. Heart disease [65] is caused by the accumulation of plaque in the arteries. As a result, blood flow is obstructed, which can lead to heart failure. The most popular tool for diagnosing CAD is angiography, which has side effects and high costs [66].
In recent years, many studies have been conducted to develop artificial intelligence-based methods as alternatives to angiography. Hence, in this paper, the NN, DNN, and FCM-DNN methods were applied for CAD diagnosis on the CMRI dataset. The main purpose was to analyze the CMRI dataset in two different approaches using the standard NN, DNN, and FCM-DNN methods. In the first approach, the labeled dataset was used for the NN and DNN modeling, while in the second approach, the labels were removed and the unlabeled dataset was clustered and then used for the FCM-DNN modeling.
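The clustering step of the second approach can be illustrated with a minimal sketch of the standard Fuzzy C-Means algorithm, which alternates between updating cluster centers and updating soft memberships. This is not the RapidMiner implementation used in this paper: the function name `fuzzy_c_means`, the fuzzifier m = 2, the random initialization, the stopping tolerance, and the toy 2-D points are all assumptions chosen for illustration, and the sketch uses 2 clusters rather than the 10 used for the FCM-DNN model. Feeding the clustered data to the DNN is not shown.

```python
import random


def fuzzy_c_means(points, n_clusters, m=2.0, n_iter=100, tol=1e-6, seed=0):
    """Return (centers, memberships) for points given as equal-length tuples."""
    rng = random.Random(seed)
    dim = len(points[0])
    # Random initial memberships; each row is normalized to sum to 1.
    u = []
    for _ in points:
        row = [rng.random() + 1e-9 for _ in range(n_clusters)]
        total = sum(row)
        u.append([v / total for v in row])
    centers = []
    for _ in range(n_iter):
        # Center update: mean of the points weighted by u_ij ** m.
        centers = []
        for j in range(n_clusters):
            weights = [u[i][j] ** m for i in range(len(points))]
            wsum = sum(weights)
            centers.append(tuple(
                sum(w * p[d] for w, p in zip(weights, points)) / wsum
                for d in range(dim)
            ))
        # Membership update from Euclidean distances to the new centers.
        shift = 0.0
        for i, p in enumerate(points):
            # Distances are floored to avoid division by zero.
            dist = [max(sum((a - b) ** 2 for a, b in zip(p, c)) ** 0.5, 1e-12)
                    for c in centers]
            new_row = [
                1.0 / sum((dist[j] / dist[k]) ** (2.0 / (m - 1.0))
                          for k in range(n_clusters))
                for j in range(n_clusters)
            ]
            shift = max(shift, max(abs(a - b) for a, b in zip(new_row, u[i])))
            u[i] = new_row
        if shift < tol:  # memberships have stopped changing
            break
    return centers, u


# Toy 2-D points forming two well-separated groups (hypothetical data).
points = [(0.0, 0.1), (0.2, 0.0), (0.1, 0.2), (5.0, 5.1), (5.2, 4.9), (4.9, 5.0)]
centers, memberships = fuzzy_c_means(points, n_clusters=2)
# Hard cluster label per point: the cluster with the largest membership.
hard_labels = [row.index(max(row)) for row in memberships]
print(hard_labels)
```

Each row of `memberships` sums to 1, so a point can belong partly to several clusters; taking the largest membership yields a hard cluster label that could then replace the removed class label as the training target.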
The results demonstrated that the proposed FCM-DNN method has the best accuracy, 99.91%, and the lowest false rates compared to the NN and DNN methods. To the best of our knowledge, no previous study has addressed CAD diagnosis on the CMRI dataset. As future work, we will study convolutional neural network and auto-encoder neural network algorithms on the CMRI dataset to diagnose CAD.