Diagnosis of Alzheimer’s Disease Using Convolutional Neural Network With Select Slices by Landmark on Hippocampus in MRI Images

Alzheimer's disease (AD) is a major public health priority. The hippocampus is one of the brain areas most affected by AD and is easily accessible as a biomarker in MRI images for machine learning diagnosis. In machine learning, using entire MRI image slices has shown lower accuracy for AD classification. We present a select slices method based on landmarks on the hippocampus region in MRI images. This study aims to determine which views of MRI images yield higher accuracy for AD classification. We then used multiclass classification on the publicly available Alzheimer's Disease Neuroimaging Initiative (ADNI) dataset with ResNet50 and LeNet to obtain results across three views and three categories. The models were applied to a dataset of 4,500 MRI slices spanning three views and three categories. Our study demonstrated that selecting slices performed better than using entire slices of MRI images for AD classification. Our method improves machine learning accuracy, and the coronal view showed the highest accuracy, playing a significant role in improving machine learning performance. The coronal view result matches the view medical experts usually use to diagnose AD. We also found that LeNet is a promising model for AD classification.


I. INTRODUCTION
Alzheimer's disease (AD) is a major public health priority [1]. Globally, around 44 million people have been diagnosed with AD, a number that may reach 131.5 million by 2050 [2]. AD is a progressive neurodegenerative disease that affects elderly people over 65 years of age and impairs memory and cognitive function [3]. Although there is no known cure for AD, some medications and treatments can temporarily relieve symptoms or slow AD progression [4]. Research studies have demonstrated that early AD diagnosis can make living with the disease easier [3].

The associate editor coordinating the review of this manuscript and approving it for publication was Jerry Chun-Wei Lin.
For an early AD diagnosis, it is important to observe and explore the deterioration process in brain regions before the disease progresses. The hippocampus is one of the most affected areas of the brain and is easily accessible as a biomarker of AD [5], [6]. For example, the degeneration of cholinergic circuits in the hippocampus and the reduction in hippocampal volume are related to memory loss [5].
In addition, a severe volume reduction of the hippocampus can be easily detected using Magnetic Resonance Imaging (MRI) images and is widely used for diagnosing AD [7]. MRI has become an excellent and valuable tool, with a highly effective imaging technique, for diagnosing and analyzing structural changes in the brain [8]. Moreover, structural MRI image biomarkers are used in three categories: AD, Mild Cognitive Impairment (MCI), and Normal Control (NC) [9]. These three categories are used to understand subtle changes in disease progression on MRI images in the early stages of AD [10].
Recently, machine learning models have been developed to diagnose AD based on MRI images [11]. Advances in machine learning have the potential to classify complex patterns from MRI images, and a diagnosis can be finalized in a brief time [12]. For example, a study by Kazemi and Houghten used machine learning models to classify different categories of AD [13]. Other studies showed that machine learning accuracy, for example in cancer detection, is comparable to manual detection. Machine learning can therefore reduce diagnosis time and is expected to perform consistently on large amounts of data at any time. In contrast, manual diagnosis results may be affected by the time required to read the MRI images when diagnosing AD. With these advantages, machine learning has become the preferred method for medical image classification [14]. Classification tasks across AD stages commonly use binary classification between two categories, such as AD vs. NC, MCI vs. AD, and NC vs. MCI [15]. In comparison, some studies used multiclass classification to classify AD into three categories [16]. For example, a study by Kazemi and Houghten used machine learning to classify AD into three categories (i.e., AD, MCI, and NC) [13]. Multiclass classification can be beneficial in distinguishing results among the three categories of AD: binary classification considers only two categories, whereas multiclass classification considers more than two. Thus, multiclass classification can help improve clinical decisions on whether someone will develop the disease by providing a result for each category when diagnosing AD.
To improve performance for AD classification, several studies have proposed different methods to improve the accuracy of machine learning on MRI images: for instance, improving MRI image quality to reduce noise [17], segmenting a specific brain region [18], classification techniques with AdaBoost [19], and slice-based methods [20]. In general, a full MRI scan is composed of 100 to 250 slices [21], [22]. However, Lopez et al. found that classification accuracy improves when using several selected slices rather than the entire slice set [23]. Despite this accuracy improvement for AD classification, there is no detailed information on how these studies selected slices in MRI images across three views and three categories. For example, Kang et al. selected 11 slices based on the highest classification accuracy in the coronal view from among slices 1 to 145 [24]. Thus, there is no information about selecting slices for the early diagnosis of AD based on the hippocampus in MRI images.
Furthermore, obtaining information on the biomarker for early AD diagnosis (i.e., the hippocampus) in MRI images to select slices requires medical experts' knowledge and experience [25], [26]. The medical experts' information can be used as the ground truth in three views and three categories for AD classification. The three views of MRI images are commonly known as axial, coronal, and sagittal [27]. Using the three views of MRI images may offer complementary features useful for AD classification. Even though MRI images can provide the landmark of the hippocampus region in all three views, medical experts usually rely on one view, so identifying it would be beneficial for diagnosing AD. Thus, the selecting slices method in MRI images can be used for AD diagnosis and is computationally simpler than using entire slices in three views and three categories (i.e., AD, MCI, and NC).
Finally, the selecting slices method focuses on landmarks in the hippocampus region in MRI images and is used for AD classification. In addition, working on the hippocampus region in MRI images offers further advantages for improving AD classification performance. Therefore, we hypothesized that selecting MRI slices using landmarks in the hippocampus region might improve performance in classifying AD. To validate our proposed method, we compared the classification results with those obtained using entire slices of MRI images. With the proposed method, this study aims to determine which views (i.e., axial, coronal, and sagittal) of MRI images yield higher accuracy for AD classification in machine learning. We then used multiclass classification on MRI images to obtain results in three categories (i.e., AD, MCI, and NC).

II. METHOD

A. DATASET ACQUISITION
Data used in this study were obtained from the publicly available Alzheimer's Disease Neuroimaging Initiative (ADNI) database (https://adni.loni.usc.edu/) in its first phase (ADNI1). Following previous studies, ADNI1 is one of the most commonly used databases for diagnosing AD [28], [29]. We used the baseline ADNI1 dataset from a 1.5 T scanner, acquired with the Magnetization Prepared Rapid Gradient Echo (MP-RAGE) sequence at a resolution of 256 × 256 × 170 voxels. Further details can be found on the ADNI website. The dataset includes 300 subjects in three categories: 100 AD, 100 MCI, and 100 NC. This is similar to other studies that used more than 100 subjects to diagnose AD using machine learning [30], [31]. Moreover, a smaller dataset reduces the number of parameters and the computation cost while still performing well for AD diagnosis [32].

B. METHOD PROCEDURE
We conducted four steps to classify AD in three views: first, collecting the data from the ADNI database; second, extracting the images; third, selecting slices; and fourth, applying a machine learning algorithm with multiclass classification. The flowchart for AD classification is shown in Figure 1.

C. SELECTING SLICES
The selecting slices method was conducted with medical experts from China Medical University Hospital who have senior experience in the field of neurology. The process of selecting slices consists of four steps. In the first step, the medical experts check the MRI images in three views and three categories to identify the biomarker in the brain region affected by AD. In a similar study by Dickerson et al., medical experts checked the brain region before making any decisions [34]. The medical experts identified the hippocampus as one of the brain regions affected by AD. Similarly, Rao et al. used the hippocampus region to diagnose AD [5]. In the second step, slices are selected using landmarks in the hippocampus region in three views: in the axial view, from the inferior midbrain ('Mickey Mouse sign') to the upper part ('apple sign'); in the coronal view, from the anterior part of the brainstem to where the whole brainstem can be seen; and in the sagittal view, at the lateral ventricles. We then marked the MRI slices in the three views and categories. In the third step, the medical experts confirmed the dataset to ensure the landmark in the hippocampus region of each MRI slice was correct. The process of selecting MRI slices using the landmarks on the hippocampus region in three views and three categories is shown in Figure 2. In the fourth step, we recruited 300 subjects from the ADNI database, with three categories (AD, MCI, and NC) each containing 100 subjects. The MRI images of each category were extracted into three views (axial, coronal, and sagittal), yielding approximately 16,500 images per view within each category. Each subject has 160 to 170 slices per view, so the total number of slices across all subjects is 149,805 in three views and three categories.
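The slice-selection step described above can be sketched as a small helper. This is a minimal illustration only: the landmark index ranges and the helper name below are hypothetical placeholders, not the paper's actual expert-marked positions.

```python
# Hypothetical landmark index ranges for each view; in the study these
# positions were identified manually by medical experts.
landmark_ranges = {
    "axial": (78, 92),     # 'Mickey Mouse sign' to 'apple sign'
    "coronal": (70, 84),   # anterior brainstem to whole brainstem visible
    "sagittal": (60, 74),  # lateral ventricles
}

def select_five(n_slices, view):
    """Return five evenly spaced slice indices inside the landmark range."""
    lo, hi = landmark_ranges[view]
    step = (hi - lo) / 4
    return [min(int(lo + i * step), n_slices - 1) for i in range(5)]

# Pick five coronal slices from a subject with 165 slices in that view.
coronal_picks = select_five(165, "coronal")
```

In the study itself, the five slices per subject were chosen and confirmed manually by the experts; a helper like this would only apply once the landmark range for a subject is known.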
Moreover, five slices were selected from each subject's 160 to 170 MRI slices for each view and category. Accordingly, there are 500 MRI slices per category in each view (5 slices from each of 100 subjects), giving 4,500 MRI slices in total. Finally, the dataset was divided 80:20, with 80% (3,600 MRI slices) used for training and 20% (900 MRI slices) used for validation. Detailed information on our balanced dataset in each view and category for training and validation is summarized in Table 1. We then applied the dataset in machine learning to improve AD classification performance. The process of selecting 5 slices from the entire slice set of the MRI images is shown in Figure 3.
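The dataset bookkeeping above can be checked with simple arithmetic (the values are taken from the paper's description):

```python
# Slice counts per the paper: 5 slices per subject, 100 subjects per
# category, 3 categories, 3 views, split 80:20 into training/validation.
SLICES_PER_SUBJECT = 5
SUBJECTS_PER_CATEGORY = 100
CATEGORIES = ["AD", "MCI", "NC"]
VIEWS = ["axial", "coronal", "sagittal"]

per_category_per_view = SLICES_PER_SUBJECT * SUBJECTS_PER_CATEGORY  # 500
per_view = per_category_per_view * len(CATEGORIES)                  # 1,500
total = per_view * len(VIEWS)                                       # 4,500

train = int(total * 0.8)  # 3,600 slices for training
val = total - train       # 900 slices for validation
```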

D. CONVOLUTIONAL NEURAL NETWORK
Recent advances in machine learning, such as convolutional neural networks (CNN), have achieved better results for AD classification [35]. Two models, ResNet50 and LeNet, were used for AD classification; both are widely used for classifying AD [36], [37].

1) ResNet50
ResNet was proposed by researchers at Microsoft Research, and the deep ResNet architecture is a milestone in the history of CNNs for images [38]. ResNet50 is one of the deep learning models proven to be very efficient in AD classification [39]. The ResNet50 architecture contains 50 layers organized into four stages; stacking additional convolutional layers as residual blocks in each stage does not reduce the model's performance. As shown in Figure 4, the 48 convolutional layers are arranged into four residual stages along with 1 max pooling and 1 average pooling layer. A ReLU activation function follows each convolutional layer; it passes positive outputs and sets negative values in the feature map to zero. The two pooling layers, one average and one max, reduce the dimensionality of the inner feature map. The fully connected layer receives the feature map as a flat vector and passes it to the SoftMax activation function, resulting in three classes for classifying AD. Moreover, several studies have used pretrained models to classify AD [40].
Generally, the network structure of ResNet50 and pretrained ResNet50 is the same; the difference lies in the training process. Without pretrained weights, the model learns the feature representations of the training data from scratch and does not build on knowledge previously learned from a massive dataset: all parameters (weights) in the network are randomly initialized. The ResNet50 architecture is shown in Figure 4.
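The difference between training from scratch and starting from pretrained weights can be illustrated with a toy one-layer example (a sketch only; the weight values and helper names are hypothetical, not from the paper):

```python
import random

def init_layer_random(n_weights, seed=0):
    """Training from scratch: weights start as small random values."""
    rng = random.Random(seed)
    return [rng.uniform(-0.05, 0.05) for _ in range(n_weights)]

def init_layer_pretrained(pretrained_weights):
    """Fine-tuning: weights start from values learned on a large dataset."""
    return list(pretrained_weights)

# Hypothetical weights "learned" beforehand on a large benchmark dataset.
imagenet_like_weights = [0.42, -1.31, 0.77, 0.05]

scratch = init_layer_random(4)                        # random start
finetune = init_layer_pretrained(imagenet_like_weights)  # learned start
```

Starting from learned weights is what allows the pretrained model to converge more quickly on a relatively small dataset, as discussed later in the paper.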

2) LeNet
The LeNet model was first designed in the study by LeCun et al. [41]. LeNet is one of the oldest and simplest models. It is highly efficient, consuming a small amount of computation time, and has been used effectively in various image classification tasks [42]. Likewise, LeNet performs well for classification in medical images [43]; for instance, it has been used to classify AD with more than 92% accuracy [44]. The LeNet model is therefore well suited to classification tasks. Thus, this study uses the LeNet model for AD classification, and the architecture is shown in Figure 5.

FIGURE 2. The process of selecting MRI slices using the landmarks on the hippocampus region in three views and three categories: (A) top, imaging signs marking the hippocampus from the axial inferior midbrain ('Mickey Mouse sign') to the upper part ('apple sign'); below, the hippocampus region landmark. (B) top, an imaging sign from the anterior part of the brainstem to where the whole brainstem can be seen; below, the hippocampus region. (C) top, an imaging sign from the sagittal view at the lateral ventricles; below, the hippocampus region.
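As a rough sketch of a LeNet-style architecture, the feature-map sizes can be traced with a small shape calculation. The 32 × 32 input and layer sizes below are the classic LeNet-5 configuration with the output set to three classes (AD, MCI, NC); this is an illustration of the architecture family, not necessarily the exact configuration used in this study.

```python
def conv_out(size, kernel, stride=1, padding=0):
    """Output spatial size of a square convolution or pooling layer."""
    return (size + 2 * padding - kernel) // stride + 1

size = 32
size = conv_out(size, kernel=5)            # C1: 5x5 conv  -> 28x28
size = conv_out(size, kernel=2, stride=2)  # S2: 2x2 pool  -> 14x14
size = conv_out(size, kernel=5)            # C3: 5x5 conv  -> 10x10
size = conv_out(size, kernel=2, stride=2)  # S4: 2x2 pool  -> 5x5

flattened = 16 * size * size               # 16 feature maps -> 400 inputs
fc_layers = [flattened, 120, 84, 3]        # F5, F6, and 3-class output
```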

III. RESULT
All experiments used the same dataset. The models were trained for multiclass classification (i.e., AD, MCI, and NC) in three views (i.e., axial, coronal, and sagittal) using ResNet50 and LeNet. The results of our study are listed in the sections below.

A. EFFECT OF SELECT SLICES
The accuracy of the proposed selecting slices method is shown in Table 2. The results were compared between using entire slices of MRI images and the selecting slices method to evaluate effectiveness and performance in three views and three categories. According to Table 2, the selecting slices method showed higher accuracy, in the range of 0.84 to 0.98 across the models, than using the entire slices of MRI images, for which accuracy was lower, in the range of 0.37 to 0.65. The accuracy comparison between the entire slices and the selecting slices method is shown in Table 2, and the accuracy graph for AD classification is shown in Figure 6.

B. COMPARISON OF THREE VIEWS
In addition, the three views were compared to see which shows the highest accuracy when selecting slices by landmarks on the hippocampus region. As shown in Figure 6(B), the coronal view showed higher accuracy across the models, with accuracies of 0.97, 0.95, and 0.98, respectively, compared with the axial and sagittal views. As shown in Table 1, the average accuracy in the coronal view is also higher, at 0.96.

C. COMPARISON OF TWO MODELS
The models were compared to see which performed better in AD classification. According to Table 2, LeNet has the best overall performance across the three views, exhibiting higher accuracy than the other models. The accuracy of the LeNet model was 0.98 in the axial view, 0.98 in the coronal view, and 0.97 in the sagittal view.

D. PERFORMANCE OF MULTICLASS CLASSIFICATION
The multiclass classification results were used to obtain values for the three categories (i.e., AD, MCI, and NC). According to Table 3, the multiclass classification results for AD, MCI, and NC in the coronal and sagittal views achieve better precision, recall, and F1 score with the pretrained ResNet50 and LeNet models, in the range of 0.90 to 1.00. In contrast, MCI in the axial view showed lower values, in the range of 0.81 to 0.88, for ResNet50.
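Per-class precision, recall, and F1 scores such as those in Table 3 are computed from the multiclass confusion matrix. A minimal sketch follows, using hypothetical counts rather than the paper's data:

```python
# 3x3 confusion matrix (rows = true class, columns = predicted class).
# The counts are hypothetical, for illustration only.
classes = ["AD", "MCI", "NC"]
conf = [
    [95, 3, 2],   # true AD
    [4, 90, 6],   # true MCI
    [1, 5, 94],   # true NC
]

def metrics(conf, i):
    """Precision, recall, and F1 for class i of a square confusion matrix."""
    tp = conf[i][i]
    fn = sum(conf[i]) - tp                 # true class i, predicted otherwise
    fp = sum(row[i] for row in conf) - tp  # predicted class i, actually other
    precision = tp / (tp + fp)
    recall = tp / (tp + fn)
    f1 = 2 * precision * recall / (precision + recall)
    return precision, recall, f1

results = {name: metrics(conf, i) for i, name in enumerate(classes)}
```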

IV. DISCUSSION
This research demonstrated that the selecting slices method can improve machine learning performance for AD classification using MRI. The findings support our hypothesis that the proposed selecting slices method is better for AD classification than using entire slices of MRI images. Furthermore, the selecting slices method showed higher accuracy in the coronal view than in the axial and sagittal views. We also found that the LeNet model performed better than ResNet50. Finally, the multiclass classification results show the value for each category in the three views.
According to the results in Table 2, the selecting slices method using landmarks on the hippocampus region showed higher accuracy than using entire slices of MRI images. In line with previous studies, selecting slices from a specific region of MRI images can increase performance for AD classification [45], [46]. In addition, we did not apply any other computational processing to the MRI images, and we still achieved higher accuracy with the selecting slices method. We may therefore assume that the selecting slices method affected the AD classification performance in machine learning. Furthermore, our selecting slices procedure might be useful as a guideline for obtaining informative slices from the entire slice set of MRI images.
Table 4 compares the performance of our proposed method with other works in the literature. The study by Valliani and Soni used a pretrained ResNet with augmentation in the axial view and showed an accuracy of 0.56 [47]. Angkoso et al. used a CNN with BET; their accuracy was 0.86 in the axial view, 0.85 in the coronal view, and 0.85 in the sagittal view [45]. Altaf et al. used KNN with a gray-level co-occurrence matrix and segmented regions, namely grey matter, white matter (WM), and cerebrospinal fluid (CSF), showing an accuracy of 0.79 [48]. Shi et al. used Conv-LSTM and reported accuracies of 0.59 (axial), 0.57 (coronal), and 0.52 (sagittal) [49]. Our selecting slices method focuses only on landmarks in the hippocampus region without any additional computational processing, yet it still showed higher accuracy than these studies across the models and three views.
According to Figure 6(B), the LeNet model in the coronal view showed higher accuracy than in the axial and sagittal views. This suggests that the coronal view may contain reliable information about the landmark in the hippocampus region. Similarly, Raju et al. found that the hippocampus region was most clearly discriminative in the coronal view [50]. Thus, the coronal view of MRI images may be more informative for early AD detection. Notably, the coronal view, which showed the higher accuracy, is the same view medical experts usually use to assess hippocampal volume changes in MRI images when diagnosing people with AD. Table 2 also shows that the LeNet model achieved higher accuracy in all three views, producing the most convincing results with an average of 0.98. Our results were similar to those of Hazarike et al. [51]. One possible reason is that the LeNet model trains all its layers directly on the MRI images for the classification task. Additionally, LeNet is well known for its simple yet effective architecture for classification problems [52]. The LeNet model is thus beneficial for classification tasks using MRI images.
Moreover, the pretrained ResNet50 showed an average performance of 0.97, while ResNet50 without pretrained weights was lower than both the pretrained ResNet50 and LeNet, with an average performance of 0.93 across the three views. When ResNet50 uses pretrained weights, the model has already been trained on a larger benchmark dataset such as ImageNet [53]: the pretrained ResNet50 starts its weight initialization from weights learned on ImageNet. In a previous study, models using pretrained weights were good at detecting high-level features like edges and patterns and could improve model performance [54]. The pretrained ResNet50 is more likely to have learned basic feature representations that transfer to our dataset, which helps it converge more quickly when training with a relatively small dataset.
Furthermore, although we obtained results for each category in the multiclass classification (AD, MCI, and NC), we found that MCI in the axial view with ResNet50 showed lower precision, recall, and F1 score, in the range of 0.81 to 0.82. One reason is that MCI has a high probability of being misdiagnosed as AD, and the structural changes in MCI are relatively subtle [24], [55]. This provides evidence that MCI is more challenging to classify than AD and NC.
However, this study still has limitations. First, for selecting slices, we manually selected slices one by one in three views from the entire slice set of the MRI images. We have 300 subjects with 149,805 MRI slices in three views and three categories, from which we manually selected 4,500 MRI slices in total. It might be beneficial to use deep learning to select the slices automatically; a similar effort in a previous study used deep learning to predict the labels of medical images [56]. Future studies may label the MRI slices for a fully automatic system that selects slices containing landmarks in the hippocampus region. Second, we did not use data augmentation to train our models. A study by Liu et al. showed that data augmentation can increase classification accuracy [57]. In future studies, we could generate more training data and increase accuracy using data augmentation.

V. CONCLUSION
This study supports our hypothesis that selecting slices can improve machine learning performance: selecting slices of MRI images significantly improves the accuracy of AD classification. Furthermore, the coronal view results have higher accuracy than the axial and sagittal views. Additionally, the LeNet model showed higher performance for AD classification. Finally, the multiclass classification results can be used to see the value of each category for AD classification.
YORI PUSPARANI is currently pursuing the Ph.D. degree in digital media design with Asia University. She is an Assistant Professor and a full-time Lecturer with the Faculty of Communication and Creative Design, Budi Luhur University, Jakarta, Indonesia. She has established a reputation as a quick learner, an effective communicator, a strong leader, and a valuable team player. Possessing outstanding flexibility, she has demonstrated exceptional responsibility, commitment, and presentation skills while showcasing proficiency in computer applications, including Adobe Illustrator, Adobe Photoshop, and Adobe InDesign. Her research interests include artificial intelligence, deep learning, plantar pressure, Alzheimer's disease, and other medical imaging fields. She understands the challenges inherent in analyzing medical images and recognizes AI technology's potential to contribute significantly to scientific articles in this area.
CHIH-YANG LIN (Senior Member, IEEE) is currently with the Department of Mechanical Engineering, National Central University, Taoyuan, Taiwan. He has been recognized as an IET Fellow and has contributed more than 150 papers that have been featured in a wide range of international conferences and journals. His research interests include computer vision, machine learning, deep learning, image processing, big data analysis, and the design of surveillance systems.

WEN-HUNG CHAO received the M.S. degree in digital media (architecture) and design from The University of Adelaide, Australia, and the Ph.D. degree in digital media (architecture) and design from Asia University, Taiwan. He is currently an Associate Professor with Asia University. His research interests include augmented reality (AR) design applications and research, digital game-based learning, and technology-enhanced learning motivation.

YIH-KUEN JAN
CHI-WEN LUNG received the bachelor's degree in industrial design from Dayeh University, the master's degree in industrial design from Tatung University, and the Ph.D. degree in biomedical engineering from National Yang-Ming University, Taiwan. Currently, he is a Professor with the Department of Creative Product Design, and the Department of Bioinformatics and Medical Engineering, Asia University, Taiwan. His current research interests include soft tissue biomechanics and its role in the development of musculoskeletal injuries and pressure injuries. In addition, he developed advanced rehabilitation research instruments.