Article

Multi-Classification of Breast Cancer Lesions in Histopathological Images Using DEEP_Pachi: Multiple Self-Attention Head

1 School of Information and Software Engineering, University of Electronic Science and Technology of China, Chengdu 610054, China
2 School of Management and Economics, University of Electronic Science and Technology of China, Chengdu 610054, China
3 School of Computer Science and Engineering, University of Electronic Science and Technology of China, Chengdu 610054, China
* Authors to whom correspondence should be addressed.
Diagnostics 2022, 12(5), 1152; https://doi.org/10.3390/diagnostics12051152
Submission received: 1 April 2022 / Revised: 23 April 2022 / Accepted: 28 April 2022 / Published: 5 May 2022
(This article belongs to the Section Medical Imaging and Theranostics)

Abstract

Introduction and Background: Despite rapid developments in the medical field, histological diagnosis is still regarded as the benchmark in cancer diagnosis. However, extracting the input image features used to determine the severity of cancer at various magnifications is arduous, since manual procedures are biased, time-consuming, labor-intensive, and error-prone. Current state-of-the-art deep learning approaches for breast histopathology image classification take features from entire images (generic features). Thus, they are likely to overlook essential image features in favor of unnecessary ones, resulting in an incorrect diagnosis of breast histopathology imaging and leading to mortality. Methods: This discrepancy prompted us to develop DEEP_Pachi for classifying breast histopathology images at various magnifications. The suggested DEEP_Pachi collects the global and regional features that are essential for effective breast histopathology image classification. The proposed model backbone is an ensemble of the DenseNet201 and VGG16 architectures. The ensemble model extracts global features (generic image information), whereas DEEP_Pachi extracts spatial information (regions of interest). The proposed model was evaluated on publicly available datasets: the BreakHis and ICIAR 2018 Challenge datasets. Results: A detailed evaluation of the proposed model's accuracy, sensitivity, precision, specificity, and F1-score revealed the usefulness of the backbone model and the DEEP_Pachi model for image classification. The suggested technique outperformed state-of-the-art classifiers, achieving an accuracy of 1.0 for the benign class and 0.99 for the malignant class in all magnifications of the BreakHis dataset and an accuracy of 1.0 on the ICIAR 2018 Challenge dataset. Conclusions: The acquired findings were significantly resilient and proved helpful for the suggested system to assist experts at large medical institutions, enabling early breast cancer diagnosis and a reduction in the death rate.

1. Introduction

Cancer is among the deadliest diseases, claiming the lives of millions of people each year. Breast Cancer (BC) is the most common cancer and the leading cause of death among women [1]. As per World Health Organization (WHO) data, 460,000 people die annually from BC out of 1,350,000 cases [2]. The United States (US) alone recorded about 268,600 instances of BC in 2019, setting a new record [3,4]. BC develops due to aberrant cell proliferation inside the breast [5]. The breast anatomy comprises blood vessels, tendons and ligaments, milk ducts, glandular lobules, and lymph ducts [6]. A benign tumor forms due to minor anomalies in the breast. A malignant tumor, in contrast, is cancerous and is further characterized as invasive carcinoma or in situ carcinoma [7]. Invasive BC spreads to nearby organs and causes complications [8,9], whereas in situ carcinoma stays limited to its territory and does not affect surrounding tissues. To avoid future progression and problems, BC must be identified early and correctly classified as benign or malignant. As a result, a prompt and accurate therapy may be devised, lowering the disease's fatality rate. Diverse imaging techniques are used to identify BC, such as Histopathology (HP) [10], Computed Tomography (CT) [11], Magnetic Resonance Imaging (MRI) [12], Ultrasound (US) [13], Mammograms (MGs) [14], and Positron Emission Tomography (PET). Statistics reported in recently published studies on imaging methods [15] reveal that 50% of the datasets utilized in BC-related research are MGs, 20% are US, 18% are MRI, and 8% are HP; the remaining percentage includes commercial records and data in other forms [6,12,16]. Further studies show that HP images offer not only binary identification and classification but also support the multiclass identification and classification of BC subtypes [17,18,19]. In this paper, a Breast Histopathology Image (BHI) dataset at various magnifications (40×, 100×, 200×, 400×) is studied. The assessment procedure varies with magnification; for instance, at 100× magnification, a specialist examines squamous development, mesenchymal involvement, and tumor localization to determine the carcinoma. Nevertheless, developing an accurate and fast model to evaluate BHIs at various magnifications is difficult due to multiple factors such as variable pixel intensity, the microscopic size of nuclei, diverse image characteristics, a wide variation of nuclei, the existence of distortions, and so on. The current effort aims to create a deep learning-based attention model to categorize BHIs at various magnifications.
Several strategies have been studied for classifying BHIs under 100× magnification [20,21]. Conventional approaches have always focused on feature extraction. However, finding relevant handcrafted characteristics necessitates experience and expertise, and these might fail to capture all variations in the dataset. Deep learning-based approaches have recently gained prominence as computing capacity has improved. Their ability to learn end-to-end makes them a better choice for BHI classification. Convolutional layers are used in deep learning algorithms to extract input image features. These convolutional layers often extract unwanted features alongside the needed ones or overlook the essential features. However, the extracted features influence the result and the decision on malignancy; thus, disregarding these aspects may result in incorrect image evaluation. As a result, the characteristics extracted by the convolutional layers of a CNN are insufficient for classifying BHIs. We present an attention-based deep learning framework that employs global and local features to determine tumor malignancy. The mechanism by which the human brain interprets visual data while analyzing the significance of input elements is known as attention. This neurological mechanism enables exclusive focus on a single piece of information while ignoring other discernible details. Nevertheless, in contrast to the selectivity of attention, the conventional and commonly used CNN classifier examines characteristics broadly and is not assured of subconsciously extracting relevant clinical knowledge in the way a trained specialist would [22]. Self-attention is a significant advancement in computer vision [23,24,25,26,27,28]. These advancements focus exclusively on essential features with no external guidance. CNN models serve as the backbone of the self-attention models and are trained end-to-end, with no modifications in the training phase. Thus, employing self-attention processes inside conventional CNNs yields several advantages in accuracy, interpretability, and robustness on clinical vision tasks.

1.1. Diagnostic Medical Methods Used in the Investigation of BC

Having mentioned several medical imaging methods used in diagnosing BC, this section describes the imaging methods related to our task and explains why we chose histopathological images. PET is an accepted imaging method that can provide useful information regarding BC; nonetheless, it is usually utilized for the initial staging of advanced or metastatic breast cancers, assessing response to therapy, and detecting and localizing recurrence of the disease [29]. As a result, we did not include it in our discussion.
The most frequently and extensively used technique is MG [30,31,32], as MGs are easily accessible as public datasets. MGs are low-dose breast X-rays [33] that are simple and frequently employed as the initial test for BC identification [34]. Regrettably, because of the vast discrepancies in the shape, surface area, and morphology of breast tissues, MGs are not always reliable, and they are associated with health effects, including radiation exposure risks for patients and radiologists [35]. Moreover, due to inadequate specificity, these techniques subject a considerable proportion of the population (65–85%) to unnecessary biopsy procedures [36,37]. Such unnecessary biopsies increase hospitalization costs for individuals and cause mental stress. Due to such limitations, US imaging is considered a much better option for breast cancer diagnosis and detection [38,39].
Compared with MGs, US imaging can significantly boost detection accuracy by 17% while decreasing needless biopsy procedures by 40% [39]. In clinical medicine, breast US images are also referred to as sonograms. US might be a superior option to MGs for BC assessment and diagnosis due to its adaptability, reliability, sensitivity, and selectivity [40]. On the other hand, BC lesion identification and classification with US imaging require radiologists' experience and knowledge due to image complexity and speckle noise [41]. Aside from the complicated image appearance, US-based assessment in female patients produces unsatisfactory false detections and misclassifications [42]. As a result, there is insufficient evidence to recommend the use of US in the diagnosis and treatment of BC.
MRI breast images yield better sensitivity for detecting BC in dense tissue [43]. MRI provides a more thorough overview of breast tissue than CT, US, or MG images because multiple samples from different angles constitute a patient's breast image set [44]. Since MRI scans are more comprehensive than alternative imaging techniques, they may uncover tumors that are not apparent with, or would not be deemed malignant by, other imaging techniques [45]. Despite MRI's high sensitivity [46], its adoption for BC diagnosis is limited due to its high cost [47]. Conversely, newer MRI methods, such as DWI (Diffusion-Weighted Imaging) and UFMRI (Ultrafast Breast MRI), provide much improved diagnostic precision with faster processing efficiency and lower costs [48,49].
HP is the process of removing a tissue sample from a suspicious anatomical site for screening and extensive investigation by specialists [50]. In clinical medicine, this procedure is commonly referred to as a biopsy. Biopsy specimens are mounted on a microscope slide and stained with Hematoxylin and Eosin (H&E) for examination [51]. HP images come in two types: (i) Whole Slide Images (WSI), which are digitized color images, and (ii) image patches derived from WSIs. Several researchers have effectively employed HP images in the multiclassification of BC owing to their tissue-level detail [17,18,19]. BC identification and classification with HP images has several benefits over MGs and other imaging alternatives such as MRI and US. In particular, HP images offer not only binary identification and classification but also support multiclass identification and classification of BC subtypes. Table 1 summarizes the discussed breast cancer modalities, their robustness, constraints, and available datasets.

1.2. Related Studies

AI-based classification of BHIs has received much attention in the research field [10,65,66,67]. There are significant obstacles in developing AI systems to examine these images, such as cancerous specimen variability, illumination and hue variations, intraclass fluctuations, different magnifications, and the existence of abnormalities, among others. Researchers have used both traditional techniques and deep learning models, which are further explored below and summarized in Table 2.
Various conventional approaches to image analysis have been presented by numerous scholars [68,69,70,71]. These approaches include several phases, such as preprocessing, region-of-interest segmentation, feature extraction, and identification. In Refs. [71,72], Local Binary Patterns (LBP) were used for BHI categorization, while the authors of Ref. [73] used the frequency distribution index, in conjunction with contours, to identify mitosis. Unfortunately, due to the varied properties of cancerous images, appearance alone is inadequate for effective image classification. Furthermore, support vector machines (SVM) [71] and decision trees (DT) [74,75] have been widely investigated for image classification. These strategies focused on data preprocessing since it significantly influenced the recognition rate. Such techniques depend on handcrafted characteristics, and designing these handcrafted traits necessitates technical knowledge and expertise. Moreover, these characteristics might not perfectly capture all of the variability in the samples, resulting in poorer predictive performance.
The ability of deep learning models to represent complicated patterns has made them a common approach for image processing. Several CNN-based methods such as ResNet, VGG-16, Inception, VGG-19, and others were proposed for image classification tasks. The authors of Ref. [76] employed a deep CNN for BHI classification. The authors of Ref. [8] used a CNN to detect invasive BC. In contrast, the author of Ref. [77] used the same CNN approach to address class imbalance and the extraction of input image features at various BHI magnifications. The authors of Ref. [78] employed a residual neural network for automated BHI assessment. The authors of Ref. [79] combined a CNN and a residual neural network for multi-level feature extraction. The authors of Ref. [80] argued that integrating squeeze-and-excitation blocks with a residual neural network yields better results than Ref. [79] for this classification task. The authors of Ref. [81] suggested that the combination of Refs. [79,80] yields a better result: they used the approach of Ref. [80] to extract the input image features in latent space and used an attention mechanism [80] for classification. Transfer learning [82,83,84] has been widely investigated as it provides room for better model performance where there are few training samples. The authors of Ref. [85] used an Inception model with residual connections via transfer learning for richer feature extraction. Ref. [86] used wavelet decomposition with a CNN for image classification. Ref. [87] integrated a soft attention network into its architecture to focus on the region of interest alone. At the same time, the author of Ref. [88] designed a class-specific deep CNN for BHI multiclass classification. To tackle the computational cost of processing huge images, the authors of Ref. [89] developed a dual-stage CNN. The authors of Ref. [90] integrated the ideas of Refs. [76,86], using adaptive spectral decomposition and an attention technique [90] for classification.
Several researchers have employed hybrid techniques to seek a better and more accurate BHI classification model. The authors of Ref. [91] used an ensemble of ResNet50, VGG19, and VGG16 as feature extractors for a logistic regression classifier. The authors of Ref. [92] suggested that a cascaded ensemble model with an SVM classifier yields better and more accurate results; the cascading occurs at the feature extraction stage (multi-lateral and syntactic features) of the CNN model. Ref. [92] created an ensemble of DenseNet121, InceptionV3, ResNet50, and VGG-16 as feature extractors. Ref. [93] investigated several pre-trained deep learning models as feature extractors and used SVMs as classifiers. Unfortunately, CNN-based techniques require a substantial amount of labeled training samples. Much research has focused on patch-level [94] and image-level [95] feature extraction for BHI classification. The author of Ref. [95] used a voting principle for classification after extracting input image features at the image and patch levels. In contrast, the authors of Ref. [94] employed pre-trained models (ResNet and Inception architectures) for input image feature extraction at the image and patch levels. Notwithstanding, there are cases where the patches analyzed for features fail to contain the ROI, thus yielding false malignancy results, as they might not adequately depict the input image.
Research has proposed numerous convolutional neural network-based classification architectures for BHIs that extract features from the entire input image. This approach often fails, as the network might overlook essential features. The properties/regions of the input images that might be overlooked include the nuclei, proliferative cells, and ducts, which are critical in determining the tumor's malignancy. As a result, neglecting certain traits may impact outcomes. Furthermore, extracting distinctive features at different magnifications is difficult due to the tiny size of nuclei. To address these constraints in the multiclassification of BC using BHI datasets, this article proposes "DEEP_Pachi", an end-to-end deep learning model incorporating multiple self-attention network heads and a multilayer perceptron. The input images are processed as a series of patches. Each patch is flattened into a single feature vector by merging the channels of all pixels in the patch and then linearly projecting it to the appropriate input dimension. Even though the proposed architecture requires more training samples than CNN architectures, the most typical approach is to use a pre-trained network and then fine-tune it on a smaller task-specific sample. This paper used pre-trained networks to mitigate the proposed model's larger training-sample requirements. To select the pre-trained networks, we first examine four pre-trained deep learning models (DenseNet201, VGG16, InceptionResNetV2, and the Xception network) on BHI images using a transfer learning technique. Afterward, an ensemble of pre-trained models functioned as the feature extractor for the DEEP_Pachi network. We propose an automated method to distinguish between benign breast tumors, such as Adenosis, Fibroadenoma, Phyllodes_tumor, and Tubular_adenoma, and malignant breast tumors, such as Ductal_carcinoma, Lobular_carcinoma, Mucinous_carcinoma, and Papillary_carcinoma, to help medical diagnosis even when professional radiologists are not accessible. Furthermore, to provide a point of comparison for our findings, the proposed method is compared to other baseline models and recently published research.
The significant contribution of this paper is summarized as follows:
This research reviews several medical BC imaging techniques, their robustness and limitations, and the associated public datasets.
This paper proposes a fine-tuned approach termed "DEEP_Pachi", an end-to-end deep learning model incorporating multiple self-attention network heads and a multilayer perceptron for the multiclassification of breast cancer using histopathological images.
According to the comprehensive transfer learning study, the suggested feature extractor discriminates remarkably well between benign breast tumors (Adenosis, Fibroadenoma, Phyllodes_tumor, and Tubular_adenoma) and malignant breast tumors (Ductal_carcinoma, Lobular_carcinoma, Mucinous_carcinoma, and Papillary_carcinoma), helping medical diagnosis even when professional radiologists are not accessible.
Based on a detailed experimental evaluation of the proposed model and comparison with state-of-the-art results, we report a deep learning method for the multiclassification of breast cancer using histopathological images that is robust in terms of accuracy, specificity, sensitivity, precision, F1-score, confusion matrix, and AUC from the receiver operating characteristic (ROC) curve.
Finally, this research suggests that the proposed model, "DEEP_Pachi", can also be used to increase the detection and classification accuracies of ensemble deep learning models.
The remainder of this article is organized as follows: Section 1 is devoted to the introduction and related studies. Section 2 outlines the materials, the proposed approach, and the evaluation measures. Section 3 introduces the experimental setup and outcomes, whereas Section 4 discusses the results. Section 5 presents the conclusion and future studies.

2. Materials and Methods

This section examines the suggested architecture and materials in depth. The implementation structure of this research is depicted in Figure 1. First, this paper argues that data preprocessing should only be applied to the training set, because when test set data are preprocessed, the trained model is likely to perform poorly in real time; thus, the first step in this paper was to split the dataset downloaded from the database. After splitting the dataset into train and test sets, data preparation procedures such as scaling, rotation, cropping, and normalization are performed on the train set. To make our model robust enough, transfer learning was used for the network backbone (feature extraction). While selecting the optimum network backbone for the proposed model, this paper conducted an experimental examination of four pre-trained deep learning models. Researchers have argued that ensemble models provide more generalized results than single models; hence, we adopted an ensemble architecture for the proposed network backbone. The ensemble network serves as the input to the proposed model (DEEP_Pachi). The proposed model comprises a self-attention network and an MLP block, as seen in Figure 2. The self-attention network receives the input in two forms: patch embedding and position embedding. This helps the self-attention network differentiate between the various symptoms in the input images. The multilayer perceptron (MLP) block improves the self-attention network's outcomes by reducing false symptom detection in the input data. The input evaluated by the self-attention network is transferred to the multilayer perceptron layer for further feature refinement before being passed to the classification/detection layer for prediction. We go over the following stages for putting our suggested approach into action.
Step 1: Data collection, splitting, and data preprocessing
Step 2: Backbone selection and ensembling for more robust and generalized features. The examined models were the DenseNet201, VGG16, Xception, and InceptionResNetV2 architectures.
Step 3: Feeding the extracted features from the ensemble model into the DEEP_Pachi architecture.
Step 4: This is the last stage of the proposed model: the identification and classification stage. The learned features are passed into the classification layer for the final result prediction.
Step 5: Then, evaluation with the test set is performed after training.

2.1. Dataset

BreaKHis, the broadest currently accessible dataset of BC histopathology images, was introduced by the authors of Ref. [4]. The dataset was obtained in Brazil at the Pathological Anatomy and Cytopathology (P&D) Lab. Eighty-two patients were diagnosed, generating Benign microscopic Images (BI) and Malignant Images (MI) at several magnifications. There are 2480 BI and 5429 MI, totaling 7909 images. The magnifications of the generated microscopic images include 40×, 100×, 200×, and 400×. Figure 3 shows a pictorial illustration of the BreaKHis dataset. It depicts the binary classification, Benign vs. Malignant, and each class's subclasses. The benign classes include adenosis (A), fibroadenoma (F), phyllodes_tumor (PT), and tubular_adenoma (TA), while the malignant classes include ductal_carcinoma (DC), lobular_carcinoma (LC), mucinous_carcinoma (MC), and papillary_carcinoma (PC). Table 3 summarizes the distribution of the employed BreaKHis dataset.

2.2. Data Pre-Processing/Augmentation

The first step with the employed dataset was to augment the data, as the number of samples in each subclass varies. Moreover, it is worth noting that deep learning models require a massive quantity of data to increase their performance and minimize the rate of misdetection and misclassification of minority samples. Table 4 shows the types of data augmentation carried out in this paper. Augmentor is a Python library used by researchers to increase the number of samples.
The Python Augmentor library was used only in a separate Python script to generate the training samples, while the original samples were kept for evaluating the model. A total of 1500 samples were generated for training at each magnification for both the benign and malignant classes. The TensorFlow data loader function was used during training to augment the train set further. Images were rescaled (the rescale operation indicates image magnification or reduction) using a 1./255 factor, with zoom range = 0.2, rotation range = 1, and horizontal flip = True. The rotation range specifies the span within which the images were spontaneously rotated throughout training. The zoom range dynamically zooms the images by up to a factor of 0.2, and the images were randomly flipped horizontally.
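These settings could look like the following minimal sketch, assuming a Keras ImageDataGenerator pipeline in which augmentation is applied only to the training split; the directory names, image size, and batch size are illustrative placeholders rather than the authors' exact configuration.

```python
# Sketch of the train-time augmentation described above (assumption: Keras
# ImageDataGenerator; directories, image size, and batch size are placeholders).
from tensorflow.keras.preprocessing.image import ImageDataGenerator

# Augmentation is applied to the training split only; test images are merely
# rescaled so that evaluation reflects the original data.
train_gen = ImageDataGenerator(
    rescale=1.0 / 255,     # scale pixel values to [0, 1]
    zoom_range=0.2,        # random zoom up to a factor of 0.2
    rotation_range=1,      # small random rotations (degrees)
    horizontal_flip=True,  # random horizontal flips
)
test_gen = ImageDataGenerator(rescale=1.0 / 255)

train_flow = train_gen.flow_from_directory(
    "breakhis/train", target_size=(224, 224), batch_size=32, class_mode="categorical")
test_flow = test_gen.flow_from_directory(
    "breakhis/test", target_size=(224, 224), batch_size=32,
    class_mode="categorical", shuffle=False)
```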

2.3. Network Backbone

The proposed network backbone in this study is an ensemble of two deep learning models obtained via the transfer learning approach. Four pre-trained deep learning models were first examined using the malignant subclass magnifications of the BreaKHis dataset: the DenseNet201 and VGG16 architectures produced the best classification performance among the four examined models. Hence, we used both as the network backbone via the ensemble approach. Ensembling is the capacity to combine several learning algorithms to obtain their collective performance, i.e., to improve the performance of existing models by integrating many models into a single trustworthy model. The network backbone serves as the feature extractor for the proposed DEEP_Pachi model, as seen in Figure 4.
VGG16 [96]: VGG16 consists of 16 layers. Following preprocessing, the input values are fed into stacked convolutional layers with 3 × 3 receptive-field filters and a fixed stride of 1. Spatial pooling is performed by five max-pooling layers, each using a 2 × 2 filter with a stride of 2. To finalize the design, two fully connected (FC) layers and a SoftMax output layer are added after the final convolution.
DenseNet201 [97]: This architecture assures information flow across network levels by linking each layer to every other layer in a feed-forward fashion (within blocks of equal feature-map size). The outputs of all preceding layers are concatenated to form the input of the next layer. The transition layers consist of a 1 × 1 convolution followed by a 2 × 2 average pooling. Global pooling is utilized after the last dense block before applying SoftMax.
Table 5 summarizes the parameters of all models implemented in this article.
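The ensemble backbone described above can be sketched as follows, assuming both pre-trained networks are used as frozen ImageNet feature extractors whose global-average-pooled outputs are concatenated; the 224 × 224 input size and layer choices are illustrative assumptions, not the authors' exact configuration.

```python
# Minimal sketch of the DenseNet201 + VGG16 ensemble backbone (assumption:
# frozen ImageNet weights, global-average-pooled features concatenated into
# one vector; input size 224x224 is illustrative).
import tensorflow as tf
from tensorflow.keras.applications import DenseNet201, VGG16

inputs = tf.keras.Input(shape=(224, 224, 3))

densenet = DenseNet201(include_top=False, weights="imagenet", input_tensor=inputs)
vgg = VGG16(include_top=False, weights="imagenet", input_tensor=inputs)
densenet.trainable = False  # transfer learning: keep pre-trained weights fixed
vgg.trainable = False

# Global (generic) features from each backbone, merged into a single vector
# that is later fed to the DEEP_Pachi attention blocks.
f1 = tf.keras.layers.GlobalAveragePooling2D()(densenet.output)
f2 = tf.keras.layers.GlobalAveragePooling2D()(vgg.output)
features = tf.keras.layers.Concatenate()([f1, f2])

backbone = tf.keras.Model(inputs, features, name="ensemble_backbone")
```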

2.4. DEEP_Pachi Architecture

The proposed architecture is based on an attention mechanism and a multilayer perceptron [98]. The attention mechanism is self-attention. The attention function maps a query and a set of key–value pairs to an output. The weight allocated to each value is determined by a compatibility function of the query with the corresponding key, and the output is the weighted sum of the values. Considering queries and keys of dimension $d_k$ and values of dimension $d_v$, the dot products of each query with all keys are computed and divided by $\sqrt{d_k}$, and a SoftMax is applied to obtain the weights on the values. The queries, keys, and values are packed into matrices Q, K, and V, which are used to compute the attention function simultaneously:
$$\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{QK^{T}}{\sqrt{d_k}}\right)V \quad (1)$$
Multi-head attention allows the model to simultaneously attend to information from several representation subspaces at various locations. Figure 5 illustrates the computation performed by multi-head self-attention:
$$\mathrm{MultiHead}(Q, K, V) = \mathrm{Concat}(\mathrm{head}_1, \ldots, \mathrm{head}_h)W^{O} \quad (2)$$

where $\mathrm{head}_i = \mathrm{Attention}(QW_i^{Q}, KW_i^{K}, VW_i^{V})$.

The projection parameter matrices are $W_i^{Q} \in \mathbb{R}^{d_{\mathrm{model}} \times d_k}$, $W_i^{K} \in \mathbb{R}^{d_{\mathrm{model}} \times d_k}$, $W_i^{V} \in \mathbb{R}^{d_{\mathrm{model}} \times d_v}$, and $W^{O} \in \mathbb{R}^{h d_v \times d_{\mathrm{model}}}$. The MLP consists of two layers with a GELU non-linearity.
$$z_0 = [x_{\mathrm{class}};\ x_p^1 E;\ x_p^2 E;\ \ldots;\ x_p^N E] + E_{\mathrm{pos}}, \qquad E \in \mathbb{R}^{(P^2 \cdot C) \times D}, \quad E_{\mathrm{pos}} \in \mathbb{R}^{(N+1) \times D} \quad (3)$$

$$z'_l = \mathrm{MSA}(\mathrm{LN}(z_{l-1})) + z_{l-1}, \qquad l = 1, \ldots, L \quad (4)$$

$$z_l = \mathrm{MLP}(\mathrm{LN}(z'_l)) + z'_l, \qquad l = 1, \ldots, L \quad (5)$$

$$y = \mathrm{LN}(z_L^{0}) \quad (6)$$
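For concreteness, one encoder block implementing Equations (4) and (5) could be sketched as below, assuming Keras' built-in MultiHeadAttention layer; the embed_dim = 64, num_heads = 8, and num_mlp = 256 values follow Section 2.5, while everything else is an illustrative assumption rather than the exact DEEP_Pachi code.

```python
# Sketch of one pre-norm encoder block (Equations (4)-(5)): multi-head
# self-attention and a two-layer GELU MLP, each with a residual connection.
# Assumption: Keras' MultiHeadAttention; sizes follow Section 2.5.
import tensorflow as tf

def encoder_block(z, embed_dim=64, num_heads=8, mlp_dim=256, drop_rate=0.01):
    # z'_l = MSA(LN(z_{l-1})) + z_{l-1}
    x = tf.keras.layers.LayerNormalization(epsilon=1e-6)(z)
    x = tf.keras.layers.MultiHeadAttention(
        num_heads=num_heads, key_dim=embed_dim // num_heads, dropout=drop_rate)(x, x)
    z_prime = tf.keras.layers.Add()([x, z])

    # z_l = MLP(LN(z'_l)) + z'_l, with a two-layer GELU MLP
    y = tf.keras.layers.LayerNormalization(epsilon=1e-6)(z_prime)
    y = tf.keras.layers.Dense(mlp_dim, activation="gelu")(y)
    y = tf.keras.layers.Dropout(drop_rate)(y)
    y = tf.keras.layers.Dense(embed_dim)(y)
    return tf.keras.layers.Add()([y, z_prime])

# Usage: stack L such blocks over a (batch, N + 1, embed_dim) sequence of
# patch embeddings, then apply LayerNormalization to the class token for y.
```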
The classification head is implemented by an MLP with one hidden layer during pre-training (Equation (5)) and by a single linear layer during fine-tuning (Equation (6)). This paper uses a SoftMax layer after the MLP block to accurately detect a sample. The SoftMax layer's primary function is to convert the encoding layer's output into probabilities in the interval (0, 1). We treated detection as a multi-classification problem in this study: input samples are passed to the encoding network, and its outputs are then mapped into the likelihood interval (0, n) via the SoftMax layer, as seen in Equation (7):
$$l_i = P(t_i \mid S_i) = \frac{1}{1 + e^{-(W_c u + b_c)}} \in (0, n) \quad (7)$$
where the weight matrix and the bias term are denoted as $W_c$ and $b_c$, respectively. We used categorical_smooth_loss to calculate the loss between the ground truth and the prediction; categorical_smooth_loss adds label smoothing to the cross-entropy loss function.
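A minimal sketch of the classification head and the label-smoothed loss is given below, assuming the label_smoothing option of Keras' CategoricalCrossentropy; the hidden-layer size, smoothing factor of 0.1, and eight output classes are illustrative assumptions.

```python
# Sketch of the MLP + SoftMax head and "categorical_smooth_loss" (assumption:
# categorical cross-entropy with label smoothing; the 0.1 factor, 256-unit
# hidden layer, and 8 classes are illustrative).
import tensorflow as tf

NUM_CLASSES = 8  # e.g., the eight BreaKHis subclasses

def classification_head(features):
    # One hidden layer followed by SoftMax probabilities over the classes.
    x = tf.keras.layers.Dense(256, activation="gelu")(features)
    return tf.keras.layers.Dense(NUM_CLASSES, activation="softmax")(x)

# Cross-entropy with label smoothing added, i.e., "categorical_smooth_loss".
loss = tf.keras.losses.CategoricalCrossentropy(label_smoothing=0.1)
```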

2.5. Experimental Setup

This experiment was performed using an Intel(R) Core (TM) i9-10850K CPU @ 3.60 GHz, a 64.0 GB RAM desktop computer, and an NVIDIA GEFORCE RTX-3080 Ti 10 GB graphics processing unit (GPU). We used open-source libraries such as Keras and TensorFlow for the implementation. The experimental parameters for all of the studies documented in this work remained consistent during training: reduce learning rate (factor of 0.2, epsilon = 0.001, patience = 10, verbose = 1), early stopping callback (patience = 10), Adam optimizer, clip value of 0.2, and 100 epochs. An epoch count of 50 was utilized to select the pre-trained models, while all other parameters remained fixed as in the main experiment. In the encoder implementation, patch size = (2, 2), drop rate = 0.01 for all layers, number of heads = 8, embed_dim = 64, num_mlp = 256, shift size = window size//2, and global average pooling was then applied.
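The training configuration listed above can be sketched as follows, assuming standard Keras callbacks and optimizer options; model construction and the data generators are assumed to be defined elsewhere, and the commented compile/fit calls are illustrative.

```python
# Sketch of the training setup: learning-rate reduction, early stopping, and
# Adam with gradient clipping (assumption: standard Keras callbacks; the
# "epsilon" of 0.001 is expressed here as min_delta).
import tensorflow as tf

reduce_lr = tf.keras.callbacks.ReduceLROnPlateau(
    monitor="val_loss", factor=0.2, patience=10, min_delta=0.001, verbose=1)
early_stop = tf.keras.callbacks.EarlyStopping(monitor="val_loss", patience=10)

optimizer = tf.keras.optimizers.Adam(clipvalue=0.2)  # clip value of 0.2

# model.compile(optimizer=optimizer, loss=loss, metrics=["accuracy"])
# model.fit(train_flow, validation_data=val_flow, epochs=100,
#           callbacks=[reduce_lr, early_stop])
```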

2.6. Evaluation

Various evaluation metrics were used to evaluate the robustness of the proposed model. The metrics include Accuracy, Precision, Specificity, F1-score, Sensitivity, and the area under a receiver operating characteristic curve (AUC). The predefined notations are TP = True Positive, FP = False Positive, TN = True Negative, and FN = False Negative. We defined classification Accuracy (ACC) as follows.
$$ACC = \frac{TP + TN}{(TP + TN) + (FP + FN)} \times 100$$
Precision (PRE) is defined as follows.
$$PRE = \frac{TP}{TP + FP} \times 100$$
Specificity ( S P E ) is defined as follows.
$$SPE = \frac{TN}{N} \times 100 = \frac{TN}{TN + FP} \times 100$$
Sensitivity (SEN) is mathematically formulated as follows.
$$SEN = \frac{TP}{P} \times 100 = \frac{TP}{TP + FN} \times 100$$
The F1 score is the harmonic mean of Precision and Sensitivity, mathematically represented as follows.

$$F_1 = \left(\frac{SEN^{-1} + PRE^{-1}}{2}\right)^{-1} = \frac{2 \times TP}{2 \times TP + FP + FN}$$
The AUC measures a classifier's performance; the ROC (Receiver Operating Characteristic) curve is obtained by plotting the TP rate against the FP rate at different threshold settings. The AUC indicates how well the model distinguishes between the given instances: the higher the AUC, the better. AUC = 1 implies a perfect classifier, whereas AUC = 0.5 suggests a classifier that assigns class labels at random. The area under the ROC curve is calculated using trapezoidal integration.
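For reference, the reported metrics can be computed from a confusion matrix as in the sketch below, assuming scikit-learn; y_true, y_pred, and y_score are placeholders for the test labels, hard predictions, and predicted positive-class probabilities.

```python
# Sketch of the evaluation metrics derived from a binary confusion matrix
# (assumption: scikit-learn; inputs are placeholders).
from sklearn.metrics import confusion_matrix, roc_auc_score

def binary_metrics(y_true, y_pred, y_score):
    tn, fp, fn, tp = confusion_matrix(y_true, y_pred).ravel()
    acc = (tp + tn) / (tp + tn + fp + fn)       # Accuracy
    pre = tp / (tp + fp)                        # Precision
    sen = tp / (tp + fn)                        # Sensitivity (recall)
    spe = tn / (tn + fp)                        # Specificity
    f1 = 2 * tp / (2 * tp + fp + fn)            # F1 score
    auc = roc_auc_score(y_true, y_score)        # area under the ROC curve
    return {"ACC": acc, "PRE": pre, "SEN": sen, "SPE": spe, "F1": f1, "AUC": auc}
```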

3. Results

This section describes the results of the experiment. The parameter sensitivity experiment is presented first to guide readers on how the proposed model parameters were selected for optimal performance. The transfer learning, binary, and multiclass experimental results are then discussed using the employed evaluation metrics and compared with the state-of-the-art results.

3.1. Parameter Sensitivity Analysis of the Proposed Method

This paper carried out a parameter sensitivity analysis of the optimal number of heads and feature extractors to ascertain the parameter settings for the proposed model's best- and worst-performance scenarios. The number of epochs and the learning rate were kept constant during this experiment. The evaluation metrics used here include accuracy, precision, and F1-score. The obtained results are recorded in Table 6. The computational cost was considered during the parameter sensitivity analysis; hence, only two, four, and eight self-attention heads and one, two, and three backbones were set up in the analysis. The backbone models used for this analysis were the DenseNet201, VGG16, and Xception architectures. It was observed that using only one pre-trained network as the proposed model backbone with different numbers of self-attention heads does not yield any significant result enhancement; thus, we focused on using only two and three pre-trained networks for the optimal feature selection approach. The best accuracy, F1-score, and precision were obtained when the number of self-attention network heads was set to four using two pre-trained networks. The optimal parameter setting of the proposed model is seen when using three pre-trained models as the network backbone and setting the number of self-attention heads to 16. Although there was only a minimal difference from using two pre-trained models and four self-attention heads, this paper used two pre-trained model backbones and set the number of self-attention heads to eight in all experiments to reduce the computational cost of the proposed model. The malignant class of the BreaKHis dataset was used in this evaluation. We combined all the malignant magnification subclasses into a binary classification task, combining the 40× and 100× magnifications for low-quality image resolution and the 200× and 400× magnifications for high-quality image resolution. We used 80% of the data for training and 20% for testing during this analysis.

3.2. Transfer Learning Experiment for Backbone Network Selection

Having first obtained the optimal performance settings for the number of self-attention heads and the number of pre-trained backbone models, we carried out a detailed experiment using both the benign class and the malignant class at various magnifications, as recorded in Table 7. From the recorded results, the transfer learning models performed very well on the benign class; hence, we focused our attention on the malignant class for backbone network selection. The excellent results of the models on the benign class can be traced to the data preprocessing technique employed in this paper. The DenseNet201 architecture had the best result at all magnifications (40×, 100×, 200×, and 400×). Comparing the recorded results, the malignant class's results at all magnifications are lower than those of the benign class. The VGG16 results show how robust the model is on both low- and high-resolution images compared to the Xception model, although the two recorded almost the same results in this experiment. InceptionResNet was the worst-performing model; hence, DenseNet201 and VGG16 were selected for the network backbone.

3.3. DEEP_Pachi Architecture Classification Result

For ideal and well-detailed microscopic image analysis, the magnification factor plays a significant role; hence, this paper experimented on all BreaKHis dataset magnifications (40×, 100×, 200×, and 400×). Before that, however, a binary classification was carried out on the BreaKHis dataset, combining all 100× and 400× magnifications for the benign and malignant classes. The reason for selecting only the 100× and 400× magnifications was to analyze the robustness of the model on low- and high-quality image resolutions and to have a neutral experiment without data augmentation. The binary classification results are shown in Table 8. The evaluation compared the backbone network (the ensemble of the DenseNet architecture and VGG16) with the DEEP_Pachi model (the proposed model). We can see a significant contribution of the proposed model, with 0.1% improvements in the benign class and +0.1–+0.3% improvements in the malignant class. Figure 6 visualizes the class performance of each model using the Precision–Recall curve and the Receiver Operating Characteristics (ROC) curve.
Table 9 depicts the multiclass classification of the BreaKHis dataset. Since the benign class showed excellent results due to the preprocessing techniques used in this paper, we focus our discussion more on the malignant class. Comparing the classification performance with the network backbone using the accuracy, sensitivity, specificity, precision, F1-score, and AUC evaluation metrics, the DEEP_Pachi architecture improved classification performance by +0.1–+0.3%. Figure 7 visualizes the benign individual class performance using the Precision–Recall (PR) curve and the Receiver Operating Characteristics (ROC) curve, while Figure 8 visualizes the malignant individual class performance using the same curves.

4. Discussion

Table 9 shows the multiclass classification performance of the proposed model vs. the backbone model (the ensemble model). Using the Precision–Recall (PR) curve and the Receiver Operating Characteristics (ROC) curve shown in Figure 8, the individual performances of the malignant classes Ductal_carcinoma, Lobular_carcinoma, Mucinous_carcinoma, and Papillary_carcinoma were recorded. Table 9 reveals that DEEP_Pachi's classification accuracy is substantially higher than that of the backbone model across the four classes, with an accuracy gain of at least 0.3%. These findings demonstrate that the DEEP_Pachi models significantly enhanced the accuracy of the BC classifier. These models can capture more essential tumor cell properties than traditional DL architectures. Conventional DL models comprise shallow convolution layers, which are insufficient for extracting the unique properties of BC cells, a difficult task due to the significant variations in H&E staining. DEEP_Pachi models, on the other hand, can capture comprehensive information from breast cell types, indicating the similarity of BC cells to normal breast cells. A deep ensemble network was used as our network backbone, which was critical for retaining the inherent ordering of features. In backbone models, low-level characteristics are recorded, and object parts are retrieved at higher levels. Furthermore, the attention mechanism raises the feature level, resulting in better classification performance.
Figure 7 shows the ROC and PR curves of the benign multiclass classification, while Figure 8 shows those of the malignant multiclass classification. Mucinous carcinoma and papillary carcinoma attained the highest area and average precision (AP) in the malignant class, whereas lobular carcinoma recorded the lowest AP and area. Table 9 shows that when the results of the DEEP_Pachi architecture are compared to the state-of-the-art results, even the backbone model alone achieves a higher accuracy for the multiclassification task; the accuracy of the backbone model alone was at least 3% greater than any of the state-of-the-art models. This demonstrates that this model can use the deep network architecture of multi-resolution input images to collect multi-scale relevant information along with the benefits of its single models. The DEEP_Pachi model outperforms by a larger margin for binary classification than for multiclass classification; this is because the various subclasses are not very dissimilar and share many characteristics. The findings show that the backbone model outperformed the other algorithms in the binary classification task, with a total accuracy of 99%. Table 9 also shows the backbone model's Sensitivity, Specificity, Precision, F1-score, and AUC vs. those of DEEP_Pachi. Because our model can capture multi-level and multi-scale data and distinguish individual nucleus features and hierarchical organization, DEEP_Pachi performed well. DEEP_Pachi may also learn features at multiple scales through its convolutional layers; as a result, it can accurately distinguish individual nuclei and nuclei structures. The experimental findings reveal that the ensemble technique outperforms all other approaches, achieving gains of at least 0.2–0.8% for images at 40×, 100×, 200×, and 400× magnification due to its capacity to collect multi-scale contextual information. DEEP_Pachi demonstrates that features derived from cross-image inputs and then merged into a boosting framework outperform standard deep learning architectures in object classification tests. This also indicates that our approach exceeds standard deep learning networks when dealing with few training data samples.

4.1. Visualizing the Influence of the DEEP_Pachi Framework

To evaluate the influence of patches and embeddings in the DEEP_Pachi model, an experiment was carried out utilizing a malignant image at 200× magnification, as shown in Figure 9. The input image (a) was first split into patches, as shown in (b), before the positional embedding (c) was added. Each patch is squeezed into a vector representation by combining the pixel channels in the patch and then linearly projecting it to the suitable input dimension. The positional embedding (c) demonstrates that the model learns to encode distance within the input image in the similarity of position embeddings, i.e., patches that are close together have more similar position embeddings. The reason for the patches and the learnable embeddings is to treat each patch separately for accurate feature extraction. The positional embedding helps the model retain where each patch was located in the original input. The patches are first converted using 2D learnable convolutions. Furthermore, analyzing the impact of the patch and embedding combination, (d) validates the envisaged approach's efficacy in highlighting prospective ROIs; this enables the model to concentrate efficiently and successfully on these areas when determining the cancer.
Figure 9d shows how the self-attention heads enable DEEP_Pachi to generalize across the input frame, even within the lowest layers. According to the diagram, the total distance in the input image over which relevant information is integrated is comparable to the receptive field size in CNNs and is highly pronounced in our model due to our network backbone, which is an ensemble of DenseNet201 and VGG16; thus, we observed consistently small attention distances in the lower layers. Implementing the DEEP_Pachi model without a network backbone, i.e., generating features from scratch, causes the attention heads to focus on the majority of the image in the lowest layers, demonstrating that the model's ability to consolidate information globally really is used. Furthermore, as the network depth increases, so does the attention distance. We observe that the model focuses on visual features that are semantically significant for classification, as depicted in Figure 10.
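A minimal sketch of the patch splitting and learnable positional embedding discussed above is given below, assuming tf.image.extract_patches and a Keras Embedding layer; the 2 × 2 patch size follows Section 2.5, while the projection dimension and maximum number of patches are illustrative assumptions.

```python
# Sketch of patch extraction plus learnable position embeddings, as in
# Figure 9a-c (assumption: patches are flattened, linearly projected, and
# tagged with a learned position vector; sizes are illustrative).
import tensorflow as tf

class PatchEmbedding(tf.keras.layers.Layer):
    def __init__(self, patch_size=2, embed_dim=64, max_patches=4096, **kwargs):
        super().__init__(**kwargs)
        self.patch_size = patch_size
        self.projection = tf.keras.layers.Dense(embed_dim)  # linear projection E
        # max_patches must be at least the number of patches actually produced.
        self.pos_embed = tf.keras.layers.Embedding(max_patches, embed_dim)  # E_pos

    def call(self, images):
        # Split the input (image or backbone feature map) into non-overlapping
        # patch_size x patch_size patches; channels must be statically known.
        batch = tf.shape(images)[0]
        patches = tf.image.extract_patches(
            images=images,
            sizes=[1, self.patch_size, self.patch_size, 1],
            strides=[1, self.patch_size, self.patch_size, 1],
            rates=[1, 1, 1, 1], padding="VALID")
        patches = tf.reshape(patches, [batch, -1, patches.shape[-1]])
        positions = tf.range(start=0, limit=tf.shape(patches)[1], delta=1)
        # Each flattened patch is projected and tagged with its position.
        return self.projection(patches) + self.pos_embed(positions)
```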

4.2. Comparison with the State-of-the-Art Results

This section compares the proposed model's results with the state-of-the-art results, as illustrated in Table 10. The state-of-the-art models follow two approaches: single models and ensemble models. Ensemble modeling is the most common approach, as seen in Table 10. Refs. [98,99] experimented with several deep learning models as feature extractors while using conventional machine learning algorithms (SVM and LR) as classifiers; however, the results were not promising, as the recorded accuracies are below 90%. Among well-known deep learning models, the DenseNet and Xception architectures are preferred over the other models, as they tend to yield classification accuracies above 90%, as recorded in Refs. [77,100,101]. It has been suggested that extracting breast cancer features using different feature extractors boosts models' classification performance: the authors employed a Shearlet-based feature extractor and a histogram-based feature extractor, concatenated the output features for their final model, and achieved better performance than single feature extractors, with a +5–8% accuracy improvement at all magnifications of the BreaKHis dataset. The result of Ref. [102] is not as promising, and the use of data augmentation for better performance is suggested; the authors carried out a binary classification of the BreaKHis dataset and a multiclass classification using 400× magnification, and among their employed data augmentation techniques, GAN-based DA yielded 77.3% accuracy for binary classification and 78.5% for multiclass classification. Comparing the performance of the Inception models, Inception_V3 and Inception_ResNet_V2 [93] produced better performance, as they extracted more relevant information by running convolution operations with varied regions of interest concurrently. The use of transfer learning is more evident in binary classification. The authors of Refs. [103,104,105,106,107] based their work on binary classification by combining the subclasses of the benign and malignant classes. VGG is often used for feature extraction, as its deeper layers are able to identify conceptual features. Our proposed model, DEEP_Pachi, is a modification of the vision transformer's self-attention head computation combined with ensemble models and a classification layer using a multilayer perceptron block; we argue that extracting richer breast cancer features requires an accurate vision system and, hence, an attention mechanism that focuses on the diseased region instead of extracting features from the entire image. Refs. [108,109,110,111] proposed accurate and more distinctive approaches for breast cancer classification. Ref. [108] employed a multi-view attention mechanism, Ref. [109] proposed a deep attention high-order network, Ref. [110] proposed using different CNN branches for richer feature generation, and Ref. [111] proposed a three-channel low-dimensional feature model. All these approaches were in line with better breast cancer feature extraction; thus, they achieved the highest classification performance, with above 95% classification accuracy at all magnifications of BreaKHis (40×, 100×, 200×, and 400×). In line with the current state-of-the-art results, our model achieved an accuracy of 99% at all magnifications except 400×, where we achieved an accuracy of 1.0. Our analyses demonstrate that our proposed models significantly enhanced the efficiency of the BC classifier.
Our models can extract more critical breast cell features than a CNN. The CNN comprised four shallow convolution layers, which were insufficient for extracting the unique properties of BC tumors, a difficult task due to the large variation in H&E stains.
The proposed model was also evaluated using the ICIAR 2018 breast cancer histology images used for the BACH Grand Challenge [123]. This dataset has 400 images, with 100 images per class. The classes of the dataset are Normal, Benign, In situ carcinoma, and Invasive carcinoma. This paper first augmented the dataset following the same augmentation principle used for the BreaKHis data. Table 11 compares the attained results with the state-of-the-art results. The use of ensemble models is very evident in the compared models. Our proposed model surpasses the accuracy of the compared models, showing its superiority.

5. Conclusions

To tackle the extraction of irrelevant features by conventional deep learning models, which results in misclassification and incorrect prediction, this paper proposed the DEEP_Pachi framework based on an ensemble model, multiple self-attention heads, and a multilayer perceptron for accurate breast cancer histological image classification. First, a thorough review of medical image modalities for breast cancer classification was carried out, together with the related open-access datasets. Secondly, we applied the Python Augmentor library to address the issue of limited training data samples; the Python Augmentor was used to generate the training image samples while the original images were utilized for testing. The proposed model utilizes an ensemble model (DenseNet201 and VGG16) as the network backbone for a more generalized feature extraction of the input images (global features), whereas multiple self-attention heads extract spatial information (regions of interest). The superiority of the proposed model was evaluated using two publicly available databases, BreakHis and ICIAR 2018, and various evaluation metrics, and the results obtained show that the proposed DEEP_Pachi outperforms the state-of-the-art results in histopathological breast cancer image classification. The suggested technique achieved an accuracy of 1.0 for the benign class and 0.99 for the malignant class in all magnifications of the BreakHis dataset and an accuracy of 0.99 on the ICIAR 2018 Challenge dataset.
As much as the proposed framework exhibits high classification accuracy, there is still room to evaluate DEEP_Pachi using other data augmentation techniques. Future work will explore various data augmentation techniques, such as GANs, for increasing training samples. We also intend to extend the DEEP_Pachi framework to the classification of other diseases from histopathological or microscopic images, such as oral cancer and skin cancer. In addition, we will investigate replacing the MLP block with SGTM neural-like structures to evaluate the best possible approach for our model.

Author Contributions

Formal analysis, G.U.N.; funding acquisition, M.A.H.; methodology, C.C.U.; project administration, H.N.M.; resources, M.A.H.; supervision, Z.Q.; validation, G.U.N.; visualization, H.N.M.; writing—original draft, C.C.U.; writing—review and editing, J.K.J. All authors have read and agreed to the published version of the manuscript.

Funding

This research was partially supported by the National Science Foundation of China (NSFC) under the project “Development of fetal heart-oriented heart sound echocardiography multimodal auxiliary diagnostic equipment” (62027827).

Institutional Review Board Statement

This article does not contain any studies with human participants or animals performed by any of the authors.

Informed Consent Statement

Not applicable.

Data Availability Statement

The dataset used in this paper is public and can be obtained from these repositories: https://www.kaggle.com/ambarish/breakhis (accessed: 12 March 2022) and https://iciar2018-challenge.grand-challenge.org/Dataset/ (accessed: 12 March 2022). The TensorFlow/Keras code we used in our experiment is not yet publicly available and will be made so after the publication of the work.

Conflicts of Interest

All authors declare that they have no conflict of interest.

References

1. Anastasiadi, Z.; Lianos, G.D.; Ignatiadou, E.; Harissis, H.V.; Mitsis, M. Breast Cancer in Young Women: An Overview. Updates Surg. 2017, 69, 313–317.
2. Wang, K.; Franch-Expósito, S.; Li, L.; Xiang, T.; Wu, J.; Ren, G. 34P Comprehensive clinical and molecular portraits of grade 3 ER+ HER- breast cancer. Ann. Oncol. 2020, 31, S27.
3. DeSantis, C.E.; Ma, J.; Gaudet, M.M.; Newman, L.A.; Miller, K.D.; Sauer, A.G.; Jemal, A.; Siegel, R.L. Breast Cancer Statistics. CA Cancer J. Clin. 2019, 69, 438–451.
4. Man, R.; Yang, P.; Xu, B. Classification of Breast Cancer Histopathological Images Using Discriminative Patches Screened by Generative Adversarial Networks. IEEE Access 2020, 8, 155362–155377.
5. Mambou, S.; Maresova, P.; Krejcar, O.; Selamat, A.; Kuca, K. Breast Cancer Detection Using Infrared Thermal Imaging and a Deep Learning Model. Sensors 2018, 18, 2799.
6. Mahmood, T.; Li, J.; Pei, Y.; Akhtar, F.; Imran, A.; Rehman, K.U. A Brief Survey on Breast Cancer Diagnostic with Deep Learning Schemes Using Multi-Image Modalities. IEEE Access 2020, 8, 165779–165809.
7. Chiao, J.Y.; Chen, K.-Y.; Liao, K.Y.-K.; Hsieh, P.-H.; Zhang, G.; Huang, T.-C. Detection and Classification the Breast Tumors Using Mask R-CNN On Sonograms. Medicine 2019, 98, e15200.
8. Cruz-Roa, A.; Gilmore, H.; Basavanhally, A.; Feldman, M.; Ganesan, S.; Shih, N.N.C.; Tomaszewski, J.; González, F.A.; Madabhushi, A. Accurate and Reproducible Invasive Breast Cancer Detection in Whole-Slide Images: A Deep Learning Approach for Quantifying Tumor Extent. Sci. Rep. 2017, 7, 46450.
9. Talbert, P.Y.; Frazier, M.D. Inflammatory Breast Cancer Disease: A Literature Review. Cancer Stud. 2019, 2.
10. Saha, M.; Chakraborty, C.; Racoceanu, D. Efficient Deep Learning Model for Mitosis Detection Using Breast Histopathology Images. Comput. Med. Imaging Graph. 2018, 64, 29–40.
11. Domingues, I.; Pereira, G.; Martins, P.; Duarte, H.; Santos, J.; Abreu, P.H. Using Deep Learning Techniques in Medical Imaging: A Systematic Review of Applications on CT And PET. Artif. Intell. Rev. 2019, 53, 4093–4160.
12. Murtaza, G.; Shuib, L.; Wahab, A.W.A.; Mujtaba, G.; Mujtaba, G.; Nweke, H.F.; Al-garadi, M.A.; Zulfiqar, F.; Raza, G.; Azmi, N.A. Deep Learning-Based Breast Cancer Classification Through Medical Imaging Modalities: State of The Art and Research Challenges. Artif. Intell. Rev. 2019, 53, 1655–1720.
13. Pavithra, S.; Vanithamani, R.; Justin, J. Computer-aided breast cancer detection using ultrasound images. Mater. Today Proc. 2020, 33, 4802–4807.
14. Moghbel, M.; Ooi, C.Y.; Ismail, N.; Hau, Y.W.; Memari, N. A Review of Breast Boundary and Pectoral Muscle Segmentation Methods in Computer-Aided Detection/Diagnosis of Breast Mammography. Artif. Intell. Rev. 2019, 53, 1873–1918.
15. Prabha, S. Thermal Imaging Techniques for Breast Screening—A Survey. Curr. Med. Imaging 2020, 16, 855–862.
16. Hadadi, I.; Rae, W.; Clarke, J.; McEntee, M.; Ekpo, E. Diagnostic Performance of Adjunctive Imaging Modalities Compared to Mammography Alone in Women with Non-Dense and Dense Breasts: A Systematic Review and Meta-Analysis. Clin. Breast Cancer 2021, 21, 278–291.
17. Nahid, A.A.; Ali, F.B.; Kong, Y. Histopathological Breast-Image Classification with Image Enhancement by Convolutional Neural Network. In Proceedings of the 2017 20th International Conference of Computer and Information Technology (ICCIT), Dhaka, Bangladesh, 22–24 December 2017.
18. Bardou, D.; Zhang, K.; Ahmad, S.M. Classification of Breast Cancer Based on Histology Images Using Convolutional Neural Networks. IEEE Access 2018, 6, 24680–24693.
19. Araújo, T.; Aresta, G.; Castro, E.; Rouco, J.; Aguiar, P.; Eloy, C.; Polónia, A.; Campilho, A. Classification of Breast Cancer Histology Images Using Convolutional Neural Networks. PLoS ONE 2017, 12, e0177544.
20. Roy, K.; Banik, D.; Bhattacharjee, D.; Nasipuri, M. Patch-Based System for Classification of Breast Histology Images Using Deep Learning. Comput. Med. Imaging Graph. 2019, 71, 90–103.
21. Kausar, T.; Wang, M.; Malik, M.S.S. Cancer Detection in Breast Histopathology with Convolution Neural Network Based Approach. In Proceedings of the 2019 IEEE/ACS 16th International Conference on Computer Systems and Applications (AICCSA), Abu Dhabi, United Arab Emirates, 3–7 November 2019.
22. Albawi, S.; Mohammed, T.A.; Al-Zawi, S. Understanding of A Convolutional Neural Network. In Proceedings of the 2017 International Conference on Engineering and Technology (ICET), Antalya, Turkey, 21–23 August 2017.
23. Perumal, V.; Narayanan, V.; Rajasekar, S.J.S. Detection of Brain Tumor with Magnetic Resonance Imaging using Deep Learning Techniques. In Brain Tumor MRI Image Segmentation Using Deep Learning Techniques; Elsevier: Amsterdam, The Netherlands, 2022; pp. 183–196.
24. Hu, J.; Shen, L.; Sun, G. Squeeze-And-Excitation Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 7132–7141.
25. Huang, Z.; Wang, X.; Huang, L.; Huang, C.; Wei, Y.; Liu, W. CCNET: Criss-Cross Attention for Semantic Segmentation. In Proceedings of the IEEE/CVF International Conference on Computer Vision, Seoul, Korea, 27–28 October 2019; pp. 603–612.
26. Wang, F.; Jiang, M.; Qian, C.; Yang, S.; Li, C.; Zhang, H.; Wang, X.; Tang, X. Residual Attention Network for Image Classification. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Honolulu, HI, USA, 21–26 July 2017; pp. 3156–3164.
27. Wang, X.; Girshick, R.; Gupta, A.; He, K. Non-Local Neural Networks. In Proceedings of the IEEE Conference on Computer Vision and Pattern Recognition, Salt Lake City, UT, USA, 18–23 June 2018; pp. 7794–7803.
28. Woo, S.; Park, J.; Lee, J.-Y.; Kweon, I.S. CBAM: Convolutional Block Attention Module. In Proceedings of the European Conference on Computer Vision (ECCV), Munich, Germany, 8–14 September 2018; pp. 3–19.
29. Sarikaya, I. Breast Cancer and Pet Imaging. Nucl. Med. Rev. Cent. East. Eur. 2021, 24, 16–26.
30. Vaishnavi, J.; Devi, M.A.; Punitha, S.; Ravi, S. Computer-aided mammography techniques for detection and classification of microcalcifications in digital mammograms. Int. J. Image Min. 2018, 3, 48.
31. Loizidou, K.; Skouroumouni, G.; Nikolaou, C.; Pitris, C. An Automated Breast Micro-Calcification Detection and Classification Technique Using Temporal Subtraction of Mammograms. IEEE Access 2020, 8, 52785–52795.
32. Suh, Y.J.; Jung, J.; Cho, B.-J. Automated Breast Cancer Detection in Digital Mammograms of Various Densities Via Deep Learning. J. Pers. Med. 2020, 10, 211.
33. Mohamed, A.A.; Berg, W.A.; Peng, H.; Luo, Y.; Jankowitz, R.C.; Wu, S. A Deep Learning Method for Classifying Mammographic Breast Density Categories. Med. Phys. 2018, 45, 314–321.
34. Mehmood, M.; Ayub, E.; Ahmad, F.; Alruwaili, M.; Alrowaili, Z.A.; Alanazi, S.; Humayun, M.; Rizwan, M.; Naseem, S.; Alyas, T. Machine Learning Enabled Early Detection of Breast Cancer by Structural Analysis of Mammograms. Comput. Mater. Contin. 2021, 67, 641–657.
35. Fiorica, J.V. Breast Cancer Screening, Mammography, And Other Modalities. Clin. Obstet. Gynecol. 2017, 59, 688–709.
36. Li, Q.; Shi, W.; Yang, H.; Zhang, H.; Li, G.; Chen, T.; Mori, K.; Jiang, Z. Computer-aided diagnosis of mammographic masses using geometric verification-based image retrieval. Med. Imaging 2017 Comput. Aided Diagn. 2017, 10134, 746–753.
  36. Li, Q.; Shi, W.; Yang, H.; Zhang, H.; Li, G.; Chen, T.; Mori, K.; Jiang, Z. Computer-aided diagnosis of mammographic masses using geometric verification-based image retrieval. Med. Imaging 2017 Comput. Aided Diagn. 2017, 10134, 746–753. [Google Scholar]
  37. Kaur, P.; Singh, G.; Kaur, P. Intellectual detection and validation of automated mammogram breast cancer images by multi-class SVM using deep learning classification. Inform. Med. Unlocked 2019, 16, 100151. [Google Scholar] [CrossRef]
  38. Tran, T.S.H.; Nguyen, H.M.T. Application of 2D Ultrasound, Elastography Arfi and Mammography for Diagnosis of solid tumors in breast. J. Med. Pharm. 2019, 58–65. [Google Scholar] [CrossRef]
  39. Han, J.; Li, F.; Peng, C.; Huang, Y.; Lin, Q.; Liu, Y.; Cao, L.; Zhou, J. Reducing Unnecessary Biopsy of Breast Lesions: Preliminary Results with Combination of Strain and Shear-Wave Elastography. Ultrasound Med. Biol. 2019, 45, 2317–2327. [Google Scholar] [CrossRef]
  40. Ucar, H.; Kacar, E.; Karaca, R. The Contribution of a Solid Breast Mass Gray-Scale Histographic Analysis in Ascertaining a Benign-Malignant Differentiation. J. Diagn. Med. Sonogr. 2022, 875647932210782. [Google Scholar] [CrossRef]
  41. Yap, M.H.; Pons, G.; Marti, J.; Ganau, S.; Sentis, M.; Zwiggelaar, R.; Davison, A.K.; Marti, R. Automated Breast Ultrasound Lesions Detection Using Convolutional Neural Networks. IEEE J. Biomed. Health Inf. 2017, 22, 1218–1226. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  42. Early Initiation of MRI-Based Breast Cancer Screening Predicted to Halve Breast Cancer Deaths in Childhood Cancer Survivor. Default Digital Object Group 2019. [CrossRef]
  43. Sriussadaporn, S.; Sriussadaporn, S.; Pak-art, R.; Kritayakirana, K.; Prichayudh, S.; Samorn, P. Ultrasonography increases sensitivity of mammography for diagnosis of multifocal, multicentric breast cancer using 356 whole breast histopathology as a gold standard. Surg. Pract. 2022. [Google Scholar] [CrossRef]
  44. Pujara, A.C.; Kim, E.; Axelrod, D.; Melsaether, A.N. PET/MRI in Breast Cancer. J. Magn. Reson. Imaging 2018, 49, 328–342. [Google Scholar] [CrossRef] [PubMed]
  45. Mann, R.M.; Athanasiou, A.; Baltzer, P.A.T.; Camps-Herrero, J.; Clauser, P.; Fallenberg, E.M.; Forrai, G.; Fuchsjäger, M.H.; Helbich, T.H.; Killburn-Toppin, F.; et al. Breast cancer screening in women with extremely dense breasts recommendations of the European Society of Breast Imaging (EUSOBI). Eur. Radiol. 2022, 1–10. [Google Scholar] [CrossRef]
  46. Houssami, N.; Cho, N. Screening Women with A Personal History of Breast Cancer: Overview of The Evidence on Breast Imaging Surveillance. Ultrasonography 2018, 37, 277. [Google Scholar] [CrossRef]
  47. Greenwood, H.I. Abbreviated Protocol Breast MRI: The Past, Present, And Future. Clin. Imaging 2019, 53, 169–173. [Google Scholar] [CrossRef]
  48. Zelst, J.C.V.; Vreemann, S.; Witt, H.-J.; Gubern-Merida, A.; Dorrius, M.D.; Duvivier, K.; Lardenoije-Broker, S.; Lobbes, M.B.; Loo, C.; Veldhuis, W.; et al. Multireader Study on The Diagnostic Accuracy of Ultrafast Breast Magnetic Resonance Imaging for Breast Cancer Screening. Investig. Radiol. 2018, 53, 579–586. [Google Scholar] [CrossRef]
  49. Heller, S.L.; Moy, L. MRI Breast Screening Revisited. J. Magn. Reson. Imaging 2019, 49, 1212–1221. [Google Scholar] [CrossRef]
  50. Aswathy, M.; Jagannath, M. Detection of Breast Cancer On Digital Histopathology Images: Present Status And Future Possibilities. Inf. Med. Unlocked 2017, 8, 74–79. [Google Scholar] [CrossRef] [Green Version]
  51. Tellez, D.; Balkenhol, M.; Karssemeijer, N.; Litjens, G.; Laak, J.V.D.; Ciompi, F. H and E Stain Augmentation Improves Generalization of Convolutional Networks for Histopathological Mitosis Detection. In Medical Imaging 2018: Digital Pathology; International Society for Optics and Photonics: Bellingham, WA, USA, 2018; p. 105810Z. [Google Scholar]
  52. Jaglan, P.; Dass, R.; Duhan, M. Breast Cancer Detection Techniques: Issues and Challenges. J. Inst. Eng. Ser. B 2019, 100, 379–386. [Google Scholar] [CrossRef]
  53. Posso, M.; Puig, T.; Carles, M.; Ru’e, M.; Canelo-Aybar, C.; Bonfill, X. Effectiveness and Cost-Effectiveness of Double Reading in Digital Mammography Screening: A Systematic Review and Meta-Analysis. Eur. J. Radiol. 2017, 96, 40–49. [Google Scholar] [CrossRef] [Green Version]
  54. Wilkinson, L.; Thomas, V.; Sharma, N. Microcalcification On Mammography: Approaches to Interpretation and Biopsy. Br. J. Radiol. 2017, 90, 20160594. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  55. Houssami, N. Evidence on Synthesized Two-dimensional Mammography Versus Digital Mammography When Using Tomosynthesis (Three-dimensional Mammography) for Population Breast Cancer Screening. Clin. Breast Cancer 2018, 18, 255–260.e1. [Google Scholar] [CrossRef] [PubMed]
  56. Fujimura, S.; Tamura, T.; Kawasaki, Y. The relationship between compressed breast thickness in mammography and other factors that influence breast cancer. J. Jpn. Assoc. Breast Cancer Screen. 2021, 30, 177–181. [Google Scholar] [CrossRef]
  57. Rapelyea, J.A.; Marks, C.G. Breast Ultrasound Past, Present, And Future. In Breast Imaging; Intech Open: Rijeka, Croatia, 2018; pp. 21–48. [Google Scholar]
  58. Sood, R.; Rositch, A.F.; Shakoor, D.; Ambinder, E.; Pool, K.-L.; Pollack, E.; Mollura, D.J.; Mullen, L.A.; Harvey, S.C. Ultrasound for Breast Cancer Detection Globally: A Systematic Review and Meta-Analysis. J. Global. Oncol. 2019, 5, 1–17. [Google Scholar] [CrossRef]
  59. Youk, J.H.; Gweon, H.M.; Son, E.J. Shear-Wave Elastography in Breast Ultrasonography: The State of the Art. Ultrasonography 2017, 36, 300. [Google Scholar] [CrossRef] [Green Version]
  60. Radiological Society of North America. Ultrasound Images. Available online: https://www.radiologyinfo.org/en/info/genus (accessed on 22 March 2022).
  61. García, E.; Diez, Y.; Diaz, O.; Llado, X.; Martí, R.; Martí, J.; Oliver, A. A Step-By-Step Review on Patient-Specific Biomechanical Finite Element Models for Breast M.R.I. To X-Ray Mammography Registration. Med. Phys. 2018, 45, e6–e31. [Google Scholar] [CrossRef] [Green Version]
  62. Kalantarova, A.; Zembol, N.J.; Kufel-Grabowska, J. Pregnancy-Associated Breast Cancer as A Screening and Diagnostic Challenge: A Case Report. Nowotwory 2021, 71, 162–164. [Google Scholar] [CrossRef]
  63. Reig, B.; Heacock, L.; Geras, K.J.; Moy, L. Machine Learning in Breast MRI. J. Magn. Reson. Imaging 2020, 52, 998–1018. [Google Scholar] [CrossRef] [PubMed]
  64. Kumar, A.; Singh, S.K.; Saxena, S.; Lakshmanan, K.; Sangaiah, A.K.; Chauhan, H.; Shrivastava, S.; Singh, R.K. Deep Feature Learning for Histopathological Image Classification of Canine Mammary Tumors and Human Breast Cancer. Inf. Sci. 2020, 508, 405–421. [Google Scholar] [CrossRef]
  65. Beevi, K.S.; Nair, M.S.; Bindu, G.R. Automatic Mitosis Detection In Breast Histopathology Images Using Convolutional Neural Network Based Deep Transfer Learning. Biocybern. Biomed. Eng. 2019, 39, 214–223. [Google Scholar] [CrossRef]
  66. Dodballapur, V.; Song, Y.; Huang, H.; Chen, M.; Chrzanowski, W.; Cai, W. Mask-Driven Mitosis Detection in Histopathology Images. In Proceedings of the 2019 IEEE 16th International Symposium on Biomedical Imaging (ISBI 2019), Venice, Italy, 8–11 April 2019. [Google Scholar]
  67. Wang, Y.; Lei, B.; Elazab, A.; Tan, E.-L.; Wang, W.; Huang, F.; Gong, X.; Wang, T. Breast Cancer Image Classification via Multi-Network Features and Dual-Network Orthogonal Low-Rank Learning. IEEE Access 2020, 8, 27779–27792. [Google Scholar] [CrossRef]
  68. Das, A.; Nair, M.S.; Peter, S.D. Sparse Representation Over Learned Dictionaries on the Riemannian Manifold for Automated Grading of Nuclear Pleomorphism in Breast Cancer. IEEE Trans. Image Process. 2019, 28, 1248–1260. [Google Scholar] [CrossRef]
  69. Dimitropoulos, K.; Barmpoutis, P.; Zioga, C.; Kamas, A.; Patsiaoura, K.; Grammalidis, N. Grading of Invasive Breast Carcinoma Through Grassmannian VLAD Encoding. PLoS ONE 2017, 12, e0185110. [Google Scholar] [CrossRef] [Green Version]
  70. Zheng, Y.; Jiang, Z.; Zhang, H.; Xie, F.; Ma, Y.; Shi, H.; Zhao, Y. Histopathological Whole Slide Image Analysis Using Context-Based CBIR. IEEE Trans. Med. Imaging 2018, 37, 1641–1652. [Google Scholar] [CrossRef]
  71. Biswas, R.; Roy, S.; Biswas, A. Mammogram Classification using Curvelet Coefficients and Gray Level Co-Occurrence Matrix for Detection of Breast Cancer. Int. J. Innov. Technol. Explor. Eng. 2019, 8, 4819–4824. [Google Scholar]
  72. Reis, S.; Gazinska, P.; Hipwell, J.H.; Mertzanidou, T.; Naidoo, K.; Williams, N.; Pinder, S.; Hawkes, D.J. Automated Classification of Breast Cancer Stroma Maturity from Histological Images. IEEE Trans. Biomed. Eng. 2017, 64, 2344–2352. [Google Scholar] [CrossRef]
  73. Vaiyapuri, S.; Mari, K.; Delphina, A.A. An Intelligent Framework for Detection and Classification of MRI Brain Tumour using SIFT-SURF Features and K-nearest Neighbour Approach. Strad Res. 2020, 7, 1–10. [Google Scholar]
  74. Krystel-Whittemore, M.; Wen, H.Y. Update on HER2 expression in breast cancer. Diagn. Histopathol. 2022, 28, 170–175. [Google Scholar] [CrossRef]
  75. Nateghi, R.; Danyali, H.; Helfroush, M.S. Maximized Inter-Class Weighted Mean for Fast and Accurate Mitosis Cells Detection in Breast Cancer Histopathology Images. J. Med. Syst. 2017, 41, 146. [Google Scholar] [CrossRef] [PubMed]
  76. Burçak, K.C.; Baykan, Ö.K.; Uğuz, H. A New Deep Convolutional Neural Network Model for Classifying Breast Cancer Histopathological Images and The Hyperparameter Optimization of The Proposed Model. J. Supercomput. 2020, 77, 973–989. [Google Scholar] [CrossRef]
  77. Li, L.; Pan, X.; Yang, H.; Liu, Z.; He, Y.; Li, Z.; Fan, Y.; Cao, Z.; Zhang, L. Multi-Task Deep Learning for Fine-Grained Classification and Grading in Breast Cancer Histopathological Images. Multimed. Tools Appl. 2018, 79, 14509–14528. [Google Scholar] [CrossRef]
  78. Gour, M.; Jain, S.; Kumar, T.S. Residual Learning-Based CNN For Breast Cancer Histopathological Image Classification. Int. J. Imaging Syst. Technol. 2020, 30, 621–635. [Google Scholar] [CrossRef]
  79. Yan, R.; Ren, F.; Wang, Z.; Wang, L.; Zhang, T.; Liu, Y.; Rao, X.; Zheng, C.; Zhang, F. Breast Cancer Histopathological Image Classification Using A Hybrid Deep Neural Network. Methods 2020, 173, 52–60. [Google Scholar] [CrossRef] [PubMed]
  80. Jiang, Y.; Chen, L.; Zhang, H.; Xiao, X. Breast Cancer Histopathological Image Classification Using Convolutional Neural Networks with Small SE-Resnet Module. PLoS ONE 2019, 14, e0214587. [Google Scholar] [CrossRef] [Green Version]
  81. Yao, H.; Zhang, X.; Zhou, X.; Liu, S. Parallel Structure Deep Neural Network Using CNN and RNN with an Attention Mechanism for Breast Cancer Histology Image Classification. Cancers 2019, 11, 1901. [Google Scholar] [CrossRef] [Green Version]
  82. Khan, S.; Islam, N.; Jan, Z.; Din, I.U.; Rodrigues, J.J.P.C. A Novel Deep Learning-Based Framework for The Detection and Classification of Breast Cancer Using Transfer Learning. Pattern Recognit. Lett. 2019, 125, 1–6. [Google Scholar] [CrossRef]
  83. Du, Y.; Zhang, R.; Zargari, A.; Thai, T.C.; Gunderson, C.C.; Moxley, K.M.; Liu, H.; Zheng, B.; Qiu, Y. Classification of Tumor Epithelium and Stroma by Exploiting Image Features Learned by Deep Convolutional Neural Networks. Ann. Biomed. Eng. 2018, 46, 1988–1999. [Google Scholar] [CrossRef]
  84. Wang, P.; Song, Q.; Li, Y.; Lv, S.; Wang, J.; Li, L.; Zhang, H. Cross-Task Extreme Learning Machine for Breast Cancer Image Classification with Deep Convolutional Features. Biomed. Signal Process. Control 2020, 57, 101789. [Google Scholar] [CrossRef]
  85. Xie, J.; Liu, R.; Luttrell, J.; Zhang, C. Deep Learning-Based Analysis of Histopathological Images of Breast Cancer. Front. Genet. 2019, 10, 80. [Google Scholar] [CrossRef] [Green Version]
  86. Kausar, T.; Wang, M.; Idrees, M.; Lu, Y. HWDCNN: Multi-Class Recognition in Breast Histopathology with HAAR Wavelet Decomposed Image-Based Convolution Neural Network. Biocybern. Biomed. Eng. 2019, 39, 967–982. [Google Scholar] [CrossRef]
  87. Yang, H.; Kim, J.-Y.; Kim, H.; Adhikari, S.P. Guided Soft Attention Network for Classification of Breast Cancer Histopathology Images. IEEE Trans. Med. Imaging 2020, 39, 1306–1315. [Google Scholar] [CrossRef] [PubMed]
  88. Han, Z.; Wei, B.; Zheng, Y.; Yin, Y.; Li, K.; Li, S. Breast Cancer Multi-classification from Histopathological Images with Structured Deep Learning Model. Sci. Rep. 2017, 7, 4172. [Google Scholar]
  89. Nazeri, K.; Aminpour, A.; Ebrahimi, M. Two-Stage Convolutional Neural Network for Breast Cancer Histology Image Classification. Image Anal. Recognit. 2018, 10882, 717–726. [Google Scholar]
  90. Xu, B.; Liu, J.; Hou, X.; Liu, B.; Garibaldi, J.; Ellis, I.O.; Green, A.; Shen, L.; Qiu, G. Attention by Selection: A Deep Selective Attention Approach to Breast Cancer Classification. IEEE Trans. Med. Imaging 2020, 39, 1930–1941. [Google Scholar] [CrossRef]
  91. Shallu; Mehra, R. Breast Cancer Histology Images Classification: Training from Scratch or Transfer Learning? ICT Express 2018, 4, 247–254. [Google Scholar] [CrossRef]
  92. Wan, T.; Cao, J.; Chen, J.; Qin, Z. Automated Grading of Breast Cancer Histopathology Using Cascaded Ensemble with Combination of Multi-Level Image Features. Neurocomputing 2017, 229, 34–44. [Google Scholar] [CrossRef]
  93. Saxena, S.; Shukla, S.; Gyanchandani, M. Pre-Trained Convolutional Neural Networks as Feature Extractors for Diagnosis of Breast Cancer Using Histopathology. Int. J. Imaging Syst. Technol. 2020, 30, 577–591. [Google Scholar] [CrossRef]
  94. Sharma, S.; Mehra, R. Conventional Machine Learning and Deep Learning Approach for Multi-Classification of Breast Cancer Histopathology Images—A Comparative Insight. J. Digit. Imaging 2020, 33, 632–654. [Google Scholar] [CrossRef] [PubMed]
  95. Zhu, C.; Song, F.; Wang, Y.; Dong, H.; Guo, Y.; Liu, J. Breast Cancer Histopathology Image Classification Through Assembling Multiple Compact CNNS. BMC Med. Inform. Decis. Mak. 2019, 19, 198. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  96. Simonyan, K.; Zisserman, A. Very Deep Convolutional Networks for Large-Scale Image Recognition. arXiv 2015, arXiv:1409.1556. [Google Scholar]
  97. Huang, G.; Liu, Z.; Maaten, L.V.D.; Weinberger, K.Q. Densely Connected Convolutional Networks. In Proceedings of the 2017 IEEE Conference on Computer Vision and Pattern Recognition (CVPR), Honolulu, HI, USA, 21–26 July 2017. [Google Scholar]
  98. Dosovitskiy, A.; Beyer, L.; Kolesnikov, A.; Weissenborn, D.; Zhai, X.; Unterthiner, T.; Dehghani, M.; Minderer, M.; Heigold, G.; Gelly, S.; et al. An image is worth 16 × 16 words: Transformers for image recognition at scale. arXiv 2020, arXiv:2010.11929. [Google Scholar]
  99. Aresta, G.; Araújo, T.; Kwok, S.; Chennamsetty, S.S.; Safwan, M.; Alex, V.; Marami, B.; Prastawa, M.; Chan, M.; Donovan, M.; et al. BACH: Grand challenge on breast cancer histology images. Med. Image Anal. 2019, 56, 122–139. [Google Scholar] [CrossRef]
  100. George, K.; Faziludeen, S.; Sankaran, P. Breast cancer detection from biopsy images using nucleus guided transfer learning and belief-based fusion. Comput. Biol. Med. 2020, 124, 103954. [Google Scholar] [CrossRef]
  101. Kwok, S. Multiclass Classification of Breast Cancer in Whole-Slide Images. Image Anal. Recognit. 2018, 10882, 931–940. [Google Scholar]
  102. Wong, W.S.; Amer, M.; Maul, T.; Liao, I.Y.; Ahmed, A. Conditional Generative Adversarial Networks for Data Augmentation in Breast Cancer Classification. Recent Adv. Soft Comput. Data Min. 2019, 978, 392–402. [Google Scholar]
  103. Thuy, M.B.H.; Hoang, V.T. Fusing of Deep Learning, Transfer Learning and GAN for Breast Cancer Histopathological Image Classification. Adv. Intell. Syst. Comput. 2019, 1121, 255–266. [Google Scholar]
  104. Li, G.; Li, C.; Wu, G.; Ji, D.; Zhang, H. Multi-View Attention-Guided Multiple Instance Detection Network for Interpretable Breast Cancer Histopathological Image Diagnosis. IEEE Access 2021, 9, 79671–79684. [Google Scholar] [CrossRef]
  105. Boumaraf, S.; Liu, X.; Zheng, Z.; Ma, X.; Ferkous, C. A new transfer learning-based approach to magnification dependent and independent classification of breast cancer in histopathological images. Biomed. Signal Process. Control 2021, 63, 102192. [Google Scholar] [CrossRef]
  106. Gupta, K.; Chawla, N. Analysis of Histopathological Images for Prediction of Breast Cancer Using Traditional Classifiers with Pre-Trained CNN. Procedia Comput. Sci. 2020, 167, 878–889. [Google Scholar] [CrossRef]
  107. Budak, Ü.; Güzel, A.B. Automatic Grading System for Diagnosis of Breast Cancer Exploiting Co-occurrence Shearlet Transform and Histogram Features. IRBM 2020, 41, 106–114. [Google Scholar] [CrossRef]
  108. Zou, Y.; Zhang, J.; Huang, S.; Liu, B. Breast cancer histopathological image classification using attention high-order deep network. Int. J. Imaging Syst. Technol. 2021, 32, 266–279. [Google Scholar] [CrossRef]
  109. Ibraheem, A.M.; Rahouma, K.H.; Hamed, H.F.A. 3PCNNB-Net: Three Parallel CNN Branches for Breast Cancer Classification Through Histopathological Images. J. Med. Biol. Eng. 2021, 41, 494–503. [Google Scholar] [CrossRef]
  110. Liu, W.; Juhas, M.; Zhang, Y. Fine-Grained Breast Cancer Classification with Bilinear Convolutional Neural Networks (BCNNs). Front. Genet. 2020, 11, 1061. [Google Scholar] [CrossRef]
  111. Kashyap, R. Evolution of histopathological breast cancer images classification using stochastic dilated residual ghost model. Turk. J. Electr. Eng. Comput. Sci. 2021, 29, 2758–2779. [Google Scholar] [CrossRef]
  112. Nahid, A.A.; Mehrabi, M.A.; Kong, Y. Histopathological Breast Cancer Image Classification by Deep Neural Network Techniques Guided by Local Clustering. BioMed Res. Int. 2018, 2018, 2362108. [Google Scholar] [CrossRef]
  113. Nawaz, M.; Sewissy, A.A.; Hassan, T. Multi-Class Breast Cancer Classification using Deep Learning Convolutional Neural Network. Int. J. Adv. Comput. Sci. Appl. 2018, 9, 316–332. [Google Scholar] [CrossRef]
  114. Sanchez-Morillo, D.; González, J.; García-Rojo, M.; Ortega, J. Classification of Breast Cancer Histopathological Images Using KAZE Features. Lect. Notes Comput. Sci. 2018, 10814, 276–286. [Google Scholar]
  115. Zhang, X.; Zhang, Y.; Qian, B.; Liu, X.; Li, X.; Wang, X.; Yin, C.; Lv, X.; Song, L.; Wang, L. Classifying Breast Cancer Histopathological Images Using a Robust Artificial Neural Network Architecture. Lect. Notes Comput. Sci. 2019, 11465, 204–215. [Google Scholar]
  116. Alom, M.Z.; Yakopcic, C.; Nasrin, M.S.; Taha, T.M.; Asari, V.K. Breast Cancer Classification from Histopathological Images with Inception Recurrent Residual Convolutional Neural Network. J. Digit. Imaging 2019, 32, 605–617. [Google Scholar] [CrossRef] [PubMed] [Green Version]
  117. Spanhol, F.A.; Oliveira, L.S.; Petitjean, C.; Heutte, L. Breast Cancer Histopathological Image Classification using Deep Convolutional Neural Network. In Proceedings of the 2016 International Joint Conference on Neural Networks (IJCNN), Vancouver, BC, Canada, 24–29 July 2016; pp. 2560–2567. [Google Scholar]
  118. Mewada, H.K.; Patel, A.V.; Hassaballah, M.; Alkinani, M.H.; Mahant, K. Spectral–Spatial Features Integrated Convolution Neural Network for Breast Cancer Classification. Sensors 2020, 20, 4747. [Google Scholar] [CrossRef] [PubMed]
  119. Saxena, S.; Shukla, S.; Gyanchandani, M. Breast cancer histopathology image classification using kernelized weighted extreme learning machine. Int. J. Imaging Syst. Technol. 2020, 31, 168–179. [Google Scholar] [CrossRef]
  120. Sharma, S.; Mehra, R.; Kumar, S. Optimised CNN in conjunction with efficient pooling strategy for the multi-classification of breast cancer. IET Image Process. 2020, 15, 936–946. [Google Scholar] [CrossRef]
  121. Hao, Y.; Qiao, S.; Zhang, L.; Xu, T.; Bai, Y.; Hu, H.; Zhang, W.; Zhang, G. Breast Cancer Histopathological Images Recognition Based on Low Dimensional Three-Channel Features. Front. Oncol. 2021, 11, 657560. [Google Scholar] [CrossRef] [PubMed]
  122. Rashmi, R.; Prasad, K.; Udupa, C.B.K. BCHisto-Net: Breast histopathological image classification by global and local feature aggregation. Artif. Intell. Med. 2021, 121, 102191. [Google Scholar]
  123. Carvalho, E.D.; Filho, A.O.C.; Silva, R.R.V.; Araújo, F.H.D.; Diniz, J.O.B.; Silva, A.C.; Paiva, A.C.; Gattass, M. Breast cancer diagnosis from histopathological images using textural features and CBIR. Artif. Intell. Med. 2020, 105, 101845. [Google Scholar] [CrossRef]
  124. Pimkin, A.; Makarchuk, G.; Kondratenko, V.; Pisov, M.; Krivov, E.; Belyaev, M. Ensembling Neural Networks for Digital Pathology Images Classification and Segmentation. Image Anal. Recognit. 2018, 10882, 877–886. [Google Scholar]
  125. Yang, Z.; Ran, L.; Zhang, S.; Xia, Y.; Zhang, Y. EMS-Net: Ensemble of Multiscale Convolutional Neural Networks for Classification of Breast Cancer Histology Images. Neurocomputing 2019, 366, 46–53. [Google Scholar] [CrossRef]
  126. Sitaula, C.; Aryal, S. Fusion of whole and part features for the classification of histopathological image of breast tissue. Health Inf. Sci. Syst. 2020, 8, 38. [Google Scholar] [CrossRef] [PubMed]
  127. Zhong, Y.; Piao, Y.; Zhang, G. Dilated and soft attention-guided convolutional neural network for breast cancer histology images classification. Microsc. Res. Tech. 2022, 85, 1248–1257. [Google Scholar] [CrossRef] [PubMed]
Figure 1. Proposed methodology block diagram.
Figure 2. Proposed model block diagram. (a) depicts the extraction of input image features via the backbone models (ensemble model). The DEEP_Pachi network accepts the extracted features in two scenarios: (b) patch embedding and (c) position embedding. (d) depicts the DEEP_Pachi framework components, namely the self-attention network and the MLP layer. (e) depicts the testing stage with new images on the trained DEEP_Pachi network.
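The listing below is a minimal Keras sketch, not the authors' released code, of the attention stage summarized in Figure 2: backbone feature patches receive a learnable position embedding, pass through multiple self-attention heads (16 heads, the strongest setting in Table 6), and an MLP layer produces the class probabilities. The patch count, embedding width, and class count used here are illustrative assumptions.

import tensorflow as tf
from tensorflow.keras import layers, Model

class AddPositionEmbedding(layers.Layer):
    # Learnable position embedding added to every patch embedding (Figure 2c).
    def build(self, input_shape):
        self.pos = self.add_weight(
            name="pos_embed", shape=(1, input_shape[1], input_shape[2]),
            initializer="random_normal", trainable=True)
    def call(self, x):
        return x + self.pos

def deep_pachi_head(num_patches=49, embed_dim=256, num_heads=16, num_classes=8):
    feats = layers.Input(shape=(num_patches, embed_dim))  # backbone feature patches (Figure 2b)
    x = AddPositionEmbedding()(feats)
    # Multiple self-attention heads over the patch sequence (Figure 2d).
    attn = layers.MultiHeadAttention(num_heads=num_heads, key_dim=embed_dim // num_heads)(x, x)
    x = layers.LayerNormalization()(x + attn)  # residual connection + normalization
    # MLP layer following the attention block.
    mlp = layers.Dense(embed_dim * 2, activation="gelu")(x)
    mlp = layers.Dense(embed_dim)(mlp)
    x = layers.LayerNormalization()(x + mlp)
    x = layers.GlobalAveragePooling1D()(x)
    return Model(feats, layers.Dense(num_classes, activation="softmax")(x), name="deep_pachi_head")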
Figure 3. Visualization of the BreaKHis dataset.
Figure 4. Proposed network backbone architecture.
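As a rough illustration of the backbone in Figure 4, the sketch below assembles the DenseNet201 and VGG16 feature extractors named in the paper and fuses their pooled outputs; the 224 × 224 input size, frozen ImageNet weights, and concatenation fusion are assumptions made here for illustration rather than the authors' exact configuration.

import tensorflow as tf
from tensorflow.keras import layers, Model
from tensorflow.keras.applications import DenseNet201, VGG16

def build_backbone(input_shape=(224, 224, 3)):
    inputs = layers.Input(shape=input_shape)
    # Two ImageNet-pretrained extractors, kept frozen for transfer learning.
    densenet = DenseNet201(include_top=False, weights="imagenet", input_shape=input_shape)
    vgg = VGG16(include_top=False, weights="imagenet", input_shape=input_shape)
    densenet.trainable = False
    vgg.trainable = False
    # Pool each feature map into a global descriptor and fuse the two branches.
    d = layers.GlobalAveragePooling2D()(densenet(inputs))
    v = layers.GlobalAveragePooling2D()(vgg(inputs))
    fused = layers.Concatenate()([d, v])
    return Model(inputs, fused, name="ensemble_backbone")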
Figure 5. Visualization of the DEEP_Pachi architecture.
Figure 6. Binary classification between benign and malignant. (a) depicts the PR curve @ 100×, (b) depicts the PR curve @ 400×, (c) depicts the ROC curve @ 100×, and (d) depicts the ROC curve @ 400×.
Figure 7. Benign individual class performance using Receiver Operating Characteristics (ROC) Curve and Precision–Recall (PR) Curve. (a) depicts the PR Curve @40×, (b) depicts PR Curve @100× (c) depicts PR Curve @ 200×, (d) depicts PR Curve @ 400×, (e) depicts ROC curve @ 40×, (f) depicts ROC curve @ 100×, (g) depicts ROC curve @ 200×, and (h) depicts ROC curve @ 400×.
Figure 8. Malignant Multiclass Classification. (a) depicts the PR Curve @40×, (b) depicts PR Curve @100×, (c) depicts PR Curve @ 200×, (d) depicts PR Curve @ 400×, (e) depicts ROC curve @ 40×, (f) depicts ROC curve @ 100×, (g) depicts ROC curve @ 200×, and (h) depicts ROC curve @ 400×.
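The per-class PR and ROC curves in Figures 6–8 can be reproduced from one-hot ground-truth labels and the model's softmax scores with scikit-learn; the helper below is a generic sketch with placeholder array names, not the authors' plotting code.

from sklearn.metrics import auc, precision_recall_curve, roc_curve

def per_class_curves(y_true_onehot, y_score, class_index):
    # One-vs-rest curves for a single class from one-hot labels and softmax scores.
    y_true = y_true_onehot[:, class_index]
    score = y_score[:, class_index]
    precision, recall, _ = precision_recall_curve(y_true, score)
    fpr, tpr, _ = roc_curve(y_true, score)
    return (recall, precision), (fpr, tpr, auc(fpr, tpr))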
Figure 9. The visualization of the implementation steps of the DEEP_Pachi model. (a) depicts the input image, (b) the input image patches, (c) learnable position embedding of the input image patches, and (d) attention matrix.
Figure 10. The visualization of the implemented DEEP_Pachi Attention.
Table 1. Robustness and constraints of various imaging techniques for BC diagnosis and treatment.
Imaging Technique: MG
Robustness:
1. Reliable and premium approach for capturing, storing, and processing images of breast tissue [52,53].
2. Unlike HP images, they do not require comprehensive experience or professional understanding to analyze and classify.
Constraints:
1. Due to their microscopic dimensions and scattered form features, MGs have restricted abilities in acquiring segments and sub-segments in the human breast [54].
2. Unsuitable for detecting breast cancer in dense breasts, where dense tissue can mask malignant tissues [55].
3. Not reliable in identifying BC; hence, additional screening may be necessary for accurate assessments [56].
Public datasets: BCDR, CBIS-DDSM, MIAS, Mini-MIAS, DDSM, InBreast
Imaging Technique: US
Robustness:
1. Does not expose patients to dangerous rays and is thus regarded as exceedingly safe, particularly for expectant mothers [57].
2. A particularly convenient imaging technique for identifying BC in dense breasts, where MGs fail [58].
3. Allows a breast tumor to be viewed from multiple viewpoints and configurations, lowering the possibility of an incorrect assessment.
Constraints:
1. Often yields false diagnoses if the scanner probe is not moved or pressed appropriately [59].
2. Cannot correctly portray the tumor outline in the breast because the signal is weakened by the surrounding muscle tissue [60].
3. US images are of lower quality than MG images; thus, obtaining ROIs for more advanced analysis is challenging with US imaging.
Public datasets: BCDR, BUSI
Imaging Technique: MRI
Robustness:
1. MRI can detect questionable spots, which can be explored further with biopsy (MRI-assisted biopsy).
2. MRI, just like US, does not expose patients to any dangerous radioactive materials.
3. MRI gives a thorough description of soft internal breast tissues as well as the ability to record.
Constraints:
1. To improve MRI images, contrast agents are frequently administered, which might cause sensitivities or other issues and are thus not suggested for some patients, particularly renal patients [61].
2. MRI is typically not suggested during pregnancy [62] and is primarily advised as a follow-up test only after an MG-based examination has been performed.
3. MRI is a pricey procedure relative to MGs or US; hence, it is not often used for BC diagnosis. MRI offers highly accurate data about the interior breast tissues, but it can overlook some malignant areas that MGs can identify [63].
Public datasets: Duke-Breast-Cancer, RIDER Breast MRI
Imaging Technique: HP
Robustness:
1. HP images are RGB images that are very efficient in diagnosing many types of malignancies and offer greater efficacy for detecting an early phase of BC.
2. An in-depth study of breast tissues is feasible with HP images, resulting in a more reliable examination of BC than other imaging alternatives.
3. Multiple ROI images may be produced from whole-slide HP images, increasing the likelihood of detecting cancerous tissue and lowering the number of false positives.
Constraints:
1. HP images are obtained by biopsy, which is an expensive approach with significant potential complications and requires special attention from pathologists compared with other imaging alternatives.
2. HP images are easy to misinterpret, and the conventional examination of HP images takes a long time [64]. As a result, experts are needed for correct interpretation.
3. Extreme caution is required during histopathology specimen preparation (from the extraction of a tissue sample from the breast, through the application of the microscope to the extracted sample, to the adjustment/control of the color disparities caused by different staining processes) to reduce the possibility of a mistaken diagnosis.
Public datasets: UCI (Wisconsin), BICBH, BreaKHis
Identified public sites for BC datasets: http://peipa.essex.ac.uk/info/mias.html, http://marathon.csee.usf.edu/Mammography/Database.html, https://biokeanos.com/source/INBreast, https://bcdr.ceta-ciemat.es/information/about, https://wiki.cancerimagingarchive.net/display/Public/, https://www.repository.cam.ac.uk/handle/1810/250394, accessed on 20 March 2022.
Table 2. Summary of the related studies.
Ref | Year | Image Type | Techniques | Task | Recorded Result
[8] | 2017 | - | ConvNet classifier | Detection | 75.86% Dice coefficient, 71.62% positive prediction, 96.77% negative prediction (pixel-by-pixel evaluation)
[12] | 2017 | - | Multiscale Basic Image Features, Local Binary Patterns, Random Decision Trees classifier | Classification | 84% accuracy
[32] | 2017 | BreaKHis, Augmented BreaKHis | CSDCNN model | Multi-classification | 93.2% accuracy
[37] | 2017 | - | Hybrid Contour Model-Based Segmentation with SVM classifier | Binary and multi-classification | 88% AUC
[36] | 2018 | BreaKHis | VGG16, VGG19, and ResNet50 with Logistic Regression | Binary classification | 92.60% accuracy, 95.65% AUC, 95.95% precision
[33] | 2018 | BACH (ICIAR 2018) | Two-stage CNN | Multi-classification | 95% accuracy
[4] | 2018 | BreaKHis | DL model with handcrafted features | Mitosis detection | 92% precision, 88% recall, 90% F-score
[5] | 2018 | BreaKHis | Transfer-learning-based CNN | Mitosis detection | 15% F1-score improvement
[27] | 2018 | TMAD, OUHSC | Transfer learning | Binary classification | 90.2% accuracy with GoogleNet
[23] | 2019 | BACH (ICIAR 2018) | Hybrid CNN + deep RNN | Multi-classification | 91.3% accuracy
[24] | 2019 | BreaKHis | Small SE-ResNet | Binary and multi-classification | 98.87–99.34% binary accuracy; 90.66–93.81% multi-classification accuracy
[25] | 2019 | BACH (ICIAR 2018), Bioimaging2015, Extended Bioimaging2015 | CNN + RNN + attention mechanism | Multi-classification | -
[6] | 2019 | BreaKHis | Mask R-CNN network with handcrafted and DCNN features | Mitosis detection | -
[26] | 2019 | BreaKHis, L.R.H. hospital Peshawar data | Transfer learning (GoogleNet, VGGNet, ResNet) | Binary classification | 97.53% accuracy
[28] | 2019 | BreaKHis | D2TL and ICELM | Binary classification | 96.67%, 96.96%, 98.18% accuracy
[29] | 2019 | BreaKHis | Inception_V3, Inception_ResNet_V2 | Multi-classification | -
[30] | 2019 | BreaKHis, BACH (ICIAR 2018) | Deep CNN with wavelet-decomposed images | Binary and multi-classification | 96.85% accuracy; 98.2% accuracy
[34] | 2019 | - | Deep selective attention | Classification | 98% accuracy
[21] | 2020 | BHIs, BreaKHis | Modified Inception network/transfer learning | Multiclass classification | -
[22] | 2020 | BreaKHis | ResHist model (residual learning CNN) | Classification | 84.34% accuracy, 90.49% F1-score; 92.52% accuracy (DA), 93.45% F1-score (DA)
[31] | 2020 | BACH (ICIAR 2018) | Attention-guided CNN | Detection and classification | 90.25% accuracy, 0.98425 AUC; single 88% accuracy, ensemble 93% accuracy
[35] | 2020 | BreaKHis, BACH (ICIAR 2018) | CNN and multi-resolution spatial-feature wavelet transform | Binary and multi-classification | 97.58% accuracy; 97.45% accuracy
[38] | 2020 | BreaKHis | CNN with several classifiers | Binary classification | -
[39] | 2020 | - | VGG16, VGG19, and ResNet50 with SVM | - | -
[19] | 2021 | BHIs | DCNN with several optimizers | Classification | 99.05% accuracy
Table 3. BreaKHis dataset.
Class | Sub_Class | 40× | 100× | 200× | 400× | Total | Nos_Patients
Benign | Adenosis | 114 | 113 | 111 | 106 | 444 | 24 (whole benign class)
Benign | Fibroadenoma | 253 | 260 | 264 | 237 | 1014 | -
Benign | Phyllodes_tumor | 109 | 121 | 108 | 115 | 453 | -
Benign | Tubular_adenoma | 149 | 150 | 140 | 130 | 569 | -
Malignant | Ductal_carcinoma | 864 | 903 | 896 | 788 | 3451 | 58 (whole malignant class)
Malignant | Lobular_carcinoma | 156 | 170 | 163 | 137 | 626 | -
Malignant | Mucinous_carcinoma | 205 | 222 | 196 | 169 | 792 | -
Malignant | Papillary_carcinoma | 145 | 142 | 135 | 138 | 560 | -
Total | | 1995 | 2081 | 2013 | 1820 | 7090 | 82
Table 4. Data augmentation Python algorithm.
import Augmentor

def upsample(src_dir, num_samples):
    # Build an augmentation pipeline over the images in src_dir.
    p = Augmentor.Pipeline(src_dir)
    p.rotate(probability=1, max_left_rotation=5, max_right_rotation=5)
    p.zoom(probability=0.2, min_factor=1.1, max_factor=1.2)
    p.skew(probability=0.2)
    p.shear(probability=0.2, max_shear_left=2, max_shear_right=2)
    p.crop_random(probability=0.5, percentage_area=0.8)
    p.flip_random(probability=0.2)
    p.random_distortion(probability=1, grid_width=4, grid_height=4, magnitude=8)
    p.flip_left_right(probability=0.8)
    p.flip_top_bottom(probability=0.3)
    p.rotate90(probability=0.5)
    p.rotate270(probability=0.5)
    # Generate the augmented samples only after all operations are registered.
    p.sample(num_samples)

# Upsample each magnification folder of the benign class to 1500 images.
for src_dir in ['D:/Pachigo/Breast_Cancer/Train/Benign/40',
                'D:/Pachigo/Breast_Cancer/Train/Benign/100',
                'D:/Pachigo/Breast_Cancer/Train/Benign/200',
                'D:/Pachigo/Breast_Cancer/Train/Benign/400']:
    upsample(src_dir, 1500)
Table 5. Optimal parameters of all implemented models.
Models | Learning Rate | Loss Function | Trainable Parameters | Non-Trainable Parameters | Total Parameters | Optimizer | Nos. of Epochs
DenseNet201 | 0.001 | Categorical smooth loss | 1,106,179 | 18,321,984 | 19,428,163 | Adam | Early stop
VGG16 | 0.001 | Categorical smooth loss | 598,403 | 14,714,688 | 15,313,091 | Adam | Early stop
InceptionResNetV2 | 0.001 | Categorical smooth loss | 393,475 | 54,336,736 | 54,730,211 | Adam | Early stop
Xception | 0.001 | Categorical smooth loss | 1,179,907 | 20,861,480 | 22,041,387 | Adam | Early stop
Ensemble | 0.001 | Categorical smooth loss | 43,872,899 | 33,036,672 | 76,909,571 | Adam | Early stop
DEEP_Pachi | 0.001 | Categorical smooth loss | 766,291 | 33,036,848 | 33,803,139 | Adam | Early stop
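Table 5's settings (Adam at a learning rate of 0.001, categorical cross-entropy with label smoothing, and early stopping instead of a fixed epoch count) map onto a Keras configuration along the lines of the sketch below; the smoothing factor, patience, and monitored quantity are assumptions, not values reported in the paper.

import tensorflow as tf

def compile_and_train(model, train_ds, val_ds):
    # Adam optimizer at 0.001 with label-smoothed categorical cross-entropy (Table 5).
    model.compile(
        optimizer=tf.keras.optimizers.Adam(learning_rate=1e-3),
        loss=tf.keras.losses.CategoricalCrossentropy(label_smoothing=0.1),
        metrics=["accuracy"])
    # Early stopping takes the place of a fixed number of epochs.
    early_stop = tf.keras.callbacks.EarlyStopping(
        monitor="val_loss", patience=5, restore_best_weights=True)
    return model.fit(train_ds, validation_data=val_ds, epochs=100, callbacks=[early_stop])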
Table 6. Parameter sensitivity analysis of DEEP_Pachi.
Nos. of Pre-Trained Networks | Nos. of Self-Attention Heads | Learning Rate | Nos. of Epochs | Accuracy (%) | Precision (%) | F1_Score (%)
1 | 2 | 3 × 10−3 | 50 | 0.96 | 0.96 | 0.96
2 | 2 | 3 × 10−3 | 50 | 0.96 | 0.97 | 0.96
3 | 2 | 3 × 10−3 | 50 | 0.97 | 0.97 | 0.97
1 | 4 | 3 × 10−3 | 50 | 0.96 | 0.97 | 0.96
2 | 4 | 3 × 10−3 | 50 | 0.97 | 0.98 | 0.97
3 | 4 | 3 × 10−3 | 50 | 0.98 | 0.97 | 0.97
1 | 8 | 3 × 10−3 | 50 | 0.96 | 0.97 | 0.97
2 | 8 | 3 × 10−3 | 50 | 0.97 | 0.99 | 0.98
3 | 8 | 3 × 10−3 | 50 | 0.98 | 0.98 | 0.98
1 | 16 | 3 × 10−3 | 50 | 0.98 | 0.98 | 0.98
2 | 16 | 3 × 10−3 | 50 | 0.99 | 1.0 | 0.98
3 | 16 | 3 × 10−3 | 50 | 1.0 | 0.98 | 0.99
Table 7. Transfer learning classification result. The experiment was performed specifically for the selection of the proposed model backbone.
Models | ACC (%) | SEN (%) | SPE (%) | PRE (%) | F1_Score (%) | AUC (%)
40× Magnification-Benign
DenseNet201 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0
InceptionResNet | 0.99 | 0.99 | 0.99 | 0.98 | 0.98 | 0.99
VGG16 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0
Xception | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0
100× Magnification-Benign
DenseNet201 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0
InceptionResNet | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0
VGG16 | 0.99 | 0.99 | 0.99 | 0.98 | 0.98 | 0.99
Xception | 0.99 | 0.99 | 0.99 | 0.98 | 0.98 | 0.99
200× Magnification-Benign
DenseNet201 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0
InceptionResNet | 0.99 | 0.98 | 0.99 | 0.99 | 0.98 | 0.98
VGG16 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0
Xception | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0
400× Magnification-Benign
DenseNet201 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0
InceptionResNet | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0
VGG16 | 0.99 | 0.98 | 0.99 | 0.99 | 0.98 | 0.98
Xception | 0.99 | 0.98 | 0.99 | 0.99 | 0.98 | 0.98
40× Magnification-Malignant
DenseNet201 | 0.98 | 0.99 | 0.99 | 0.95 | 0.97 | 0.99
InceptionResNet | 0.94 | 0.95 | 0.97 | 0.83 | 0.88 | 0.96
VGG16 | 0.94 | 0.93 | 0.96 | 0.82 | 0.86 | 0.94
Xception | 0.94 | 0.93 | 0.96 | 0.82 | 0.86 | 0.94
100× Magnification-Malignant
DenseNet201 | 0.97 | 0.98 | 0.98 | 0.91 | 0.94 | 0.98
InceptionResNet | 0.94 | 0.95 | 0.97 | 0.83 | 0.88 | 0.96
VGG16 | 0.94 | 0.94 | 0.96 | 0.83 | 0.87 | 0.95
Xception | 0.96 | 0.96 | 0.97 | 0.86 | 0.90 | 0.97
200× Magnification-Malignant
DenseNet201 | 0.98 | 0.97 | 0.98 | 0.94 | 0.95 | 0.98
InceptionResNet | 0.93 | 0.94 | 0.96 | 0.80 | 0.85 | 0.95
VGG16 | 0.92 | 0.93 | 0.95 | 0.79 | 0.84 | 0.94
Xception | 0.95 | 0.95 | 0.97 | 0.85 | 0.89 | 0.96
400× Magnification-Malignant
DenseNet201 | 0.98 | 0.98 | 0.98 | 0.92 | 0.95 | 0.98
InceptionResNet | 0.96 | 0.97 | 0.98 | 0.88 | 0.92 | 0.97
VGG16 | 0.97 | 0.96 | 0.98 | 0.90 | 0.93 | 0.97
Xception | - | - | - | - | - | -
ACC denotes Accuracy; SEN = Sensitivity; SPE = Specificity; PRE = Precision; AUC = Area under the ROC Curve.
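For reference, the reported metrics follow their standard confusion-matrix definitions, where TP, TN, FP, and FN denote true positives, true negatives, false positives, and false negatives: Accuracy = (TP + TN)/(TP + TN + FP + FN); Sensitivity (recall) = TP/(TP + FN); Specificity = TN/(TN + FP); Precision = TP/(TP + FP); F1-score = 2 × Precision × Sensitivity/(Precision + Sensitivity); AUC is the area under the ROC curve.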
Table 8. Binary classification using DEEP_Pachi.
Models | ACC (%) | SEN (%) | SPE (%) | PRE (%) | F1_Score | AUC
100× Magnification
Backbone Network | 0.99 | 0.99 | 0.99 | 0.99 | 0.99 | 0.99
DEEP_Pachi | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0
400× Magnification
Network Backbone | 0.95 | 0.93 | 0.93 | 0.95 | 0.94 | 0.93
DEEP_Pachi | 0.96 | 0.96 | 0.96 | 0.97 | 0.95 | 0.96
Table 9. Multiclass classification using DEEP_Pachi vs. the network backbone.
Models | ACC (%) | SEN (%) | SPE (%) | PRE (%) | F1_Score (%) | AUC (%)
40× Magnification-Benign
Network Backbone | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0
DEEP_Pachi | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0
100× Magnification-Benign
Network Backbone | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0
DEEP_Pachi | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0
200× Magnification-Benign
Network Backbone | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0
DEEP_Pachi | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0
400× Magnification-Benign
Network Backbone | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0
DEEP_Pachi | 1.0 | 1.0 | 1.0 | 1.0 | 1.0 | 1.0
40× Magnification-Malignant
Network Backbone | 0.97 | 0.98 | 0.98 | 0.92 | 0.94 | 0.98
DEEP_Pachi | 0.99 | 1.0 | 1.0 | 0.96 | 0.98 | 0.98
100× Magnification-Malignant
Network Backbone | 0.97 | 0.98 | 0.98 | 0.91 | 0.94 | 0.98
DEEP_Pachi | 0.99 | 1.0 | 1.0 | 0.94 | 0.98 | 0.98
200× Magnification-Malignant
Network Backbone | 0.96 | 0.96 | 0.98 | 0.90 | 0.92 | 0.97
DEEP_Pachi | 0.99 | 0.99 | 0.99 | 0.95 | 0.98 | 0.98
400× Magnification-Malignant
Network Backbone | 0.98 | 0.98 | 0.98 | 0.92 | 0.95 | 0.98
DEEP_Pachi | 1.0 | 1.0 | 1.0 | 0.97 | 0.99 | 0.99
Table 10. Result comparison with the state-of-the-art result using the BreaKHis Dataset.
Ref/Year | Approach | Data Type | Classification Type | Accuracy (%): 40× | 100× | 200× | 400× | Binary
[112] 2018 | Ensemble (CNN + LSTM) | BreaKHis | - | 88.7 | 85.3 | 88.6 | 88.4 | -
[113] 2018 | DenseNet CNN | BreaKHis | - | 93.6 | 97.4 | 95.9 | 94.7 | -
[77] 2018 | Xception | BreaKHis | - | 95.3 | 93.4 | 93.1 | 91.7 | -
[114] 2018 | KAZE features + Bag of Features | BreaKHis | - | 85.9 | 80.4 | 78.1 | 71.1 | -
[102] 2019 | CNN | BreaKHis | - | accuracy 77.2
[102] 2019 | CNN + DA | BreaKHis | - | accuracy 76.7
[102] 2019 | CGANs-based DA | BreaKHis | - | accuracy 77.3
[102] 2019 | DA + CGANs-based DA | BreaKHis | - | accuracy 75.2
[102] 2019 | CNN | BreaKHis | - | accuracy 75.4
[102] 2019 | CNN + DA | BreaKHis | - | accuracy 75.9
[102] 2019 | CGANs-based DA | BreaKHis | - | accuracy 78.5
[102] 2019 | DA + CGANs-based DA | BreaKHis | - | accuracy 78.7
[115] 2019 | Deep ResNet + CBAM | BreaKHis | - | 91.2 | 91.7 | 92.6 | 88.9 | -
[103] 2019 | Transfer Learning (VGG16 + VGG19 + CNN) | - | - | 98.2 | 98.3 | 98.2 | 97.5 | 98.1
[116] 2019 | IRRCNN | BreaKHis | - | 98.0 | 97.6 | 97.3 | 97.4 | -
[85] 2019 | Inception_V3 | BreaKHis | Multiclass | 90.3 | 85.4 | 84.0 | 82.1 | -
[85] 2019 | Inception_V3 | BreaKHis | Binary | 97.7 | 94.2 | 87.2 | 96.7 | -
[85] 2019 | Inception_ResNet_V2 | BreaKHis | Multiclass | 98.4 | 98.7 | 97.9 | 97.4 | -
[85] 2019 | Inception_ResNet_V2 | BreaKHis | Binary | 99.9 | 99.9 | 1.0 | 99.9 | -
[80] 2019 | BHCNet-6 + ERF | BreaKHis | Multiclass | 94.4 | 94.5 | 92.3 | 91.1 | -
[80] 2019 | CNN + SE-ResNet | BreaKHis | Binary | 98.9 | 99.0 | 99.3 | 99.0 | -
[117] 2020 | Deep CNN | BreaKHis | - | 73.4 | 76.8 | 83.2 | 75.8 | -
[94] 2020 | VGG16 + SVM (Balanced + DA) | BreaKHis | - | 94.0 | 92.9 | 91.2 | 91.8 | -
[94] 2020 | Ensemble (VGG16 + VGG19 + ResNet 50) + RF Classifier | BreaKHis | - | 90.3 | 90.1 | 87.4 | 86.6 | -
[94] 2020 | Ensemble (VGG16 + VGG19 + ResNet 50) + SVM Classifier | BreaKHis | - | 82.2 | 87.6 | 86.5 | 83.0 | -
[78] 2020 | ResHist (RL-based 152-layer CNN) | BreaKHis | - | 86.4 | 87.3 | 91.4 | 86.3 | -
[64] 2020 | VGGNET16-RF | BreaKHis | - | 92.2 | 93.4 | 95.2 | 92.8 | -
[64] 2020 | VGGNET16-SVM | BreaKHis | - | 94.1 | 95.1 | 97.0 | 93.4 | -
[118] 2020 | CNN + spectral–spatial features | BreaKHis | Malignant | 97.6 | 97.4 | 97.3 | 97.0 | -
[100] 2020 | NucTraL + BCF | BreaKHis | - | accuracy 96.9
[119] 2020 | ResNet50 + KWELM | BreaKHis | Malignant | 88.4 | 87.1 | 90.0 | 84.1 | -
[93] 2020 | AlexNet + SVM | BreaKHis | - | 84.1 | 87.5 | 89.4 | 85.2 | -
[93] 2020 | VGG16 + SVM | BreaKHis | - | 86.4 | 87.8 | 86.8 | 84.4 | -
[93] 2020 | VGG19 + SVM | BreaKHis | - | 86.6 | 88.1 | 85.8 | 81.7 | -
[93] 2020 | GoogleNet + SVM | BreaKHis | - | 81.0 | 84.5 | 82.5 | 79.8 | -
[93] 2020 | ResNet18 + SVM | BreaKHis | - | 84.0 | 84.3 | 82.5 | 79.8 | -
[93] 2020 | ResNet50 + SVM | BreaKHis | - | 87.7 | 87.8 | 90.1 | 83.7 | -
[93] 2020 | ResNet101 + SVM | BreaKHis | - | 86.4 | 88.9 | 90.1 | 83.2 | -
[93] 2020 | ResNetInceptionV2 + SVM | BreaKHis | - | 86.3 | 86.3 | 87.1 | 81.4 | -
[93] 2020 | InceptionV3 + SVM | BreaKHis | - | 85.8 | 84.7 | 86.8 | 82.9 | -
[93] 2020 | SqueezeNet + SVM | BreaKHis | - | 81.2 | 83.7 | 84.2 | 77.5 | -
[120] 2020 | Optimized CNN | BreaKHis | - | 80.8 | 76.6 | 79.9 | 74.2 | -
[110] 2020 | InceptionV3 + BCNNs | BreaKHis | - | 95.7 | 94.7 | 94.8 | 94.5 | 96.1
[105] 2020 | VGG16 + SVM | BreaKHis | - | 78.6 | 85.2 | 82.0 | 79.6 | -
[105] 2020 | VGG19 + SVM | BreaKHis | - | 77.3 | 79.1 | 83.0 | 79.1 | -
[105] 2020 | Xception + SVM | BreaKHis | - | 81.6 | 82.9 | 78.4 | 76.1 | -
[105] 2020 | ResNet50 + SVM | BreaKHis | - | 86.4 | 86.0 | 84.3 | 82.9 | -
[105] 2020 | VGG16 + LR | BreaKHis | - | 78.8 | 85.2 | 81.2 | 79.1 | -
[105] 2020 | VGG19 + LR | BreaKHis | - | 77.6 | 82.4 | 82.2 | 77.8 | -
[105] 2020 | Xception + LR | BreaKHis | - | 82.4 | 79.6 | 79.4 | 83.1 | -
[105] 2020 | ResNet50 + LR | BreaKHis | - | 83.1 | 86.7 | 84.0 | 80.1 | -
[107] 2020 | Shearlet-based features | BreaKHis | - | 89.4 | 88.0 | 86.0 | 83.0 | -
[107] 2020 | Histogram-based features | BreaKHis | - | 92.6 | 93.9 | 95.0 | 94.7 | -
[107] 2020 | Concatenation of all features | BreaKHis | - | 98.2 | 97.2 | 97.8 | 97.3 | -
[104] 2021 | MA-MIDN | BreaKHis | - | 96.3 | 95.7 | 97.0 | 95.4 | -
[108] 2021 | AhoNet (ResNet18 + ECA + MPN-COV) | BreaKHis | - | 97.5 | 97.3 | 99.2 | 97.1 | -
[109] 2021 | 3PCNNB-Net | BreaKHis | - | 92.3 | 93.1 | 97.0 | 92.1 | -
[121] 2021 | APVEC | BreaKHis | - | 92.1 | 90.2 | 95.0 | 92.8 | -
[111] 2021 | Stochastic Dilated Residual Ghost Model | BreaKHis | - | 98.4 | 98.4 | 96.3 | 97.4 | -
[105] 2021 | Transfer Learning via Fine-tuning Strategy | BreaKHis | - | 99.3 | 99.0 | 98.1 | 98.8 | 98.4
[122] 2021 | BCHisto-Net | BreaKHis | - | - | 89 (100× magnification only) | - | - | -
Ours | DEEP_Pachi | BreaKHis | - | 99.8 | 99.8 | 99.8 | 1.0 | 99.8
Table 11. Result comparison with the state-of-the-art result using the ICIAR 2018 Dataset.
Ref/Year | Approach | Data Type | Accuracy (%)
[18] 2018 | DCNN + SVM | BACH | 77.8
[123] 2018 | Pre-trained VGG-16 | BACH | 83.0
[123] 2018 | Ensemble of three DCNNs | BACH | 87.0
[124] 2018 | Ensemble (DenseNet 169 + DenseNet 201 + ResNet 34) | BACH | 90.0
[20] 2019 | All Patches in One Decision | BACH | 90.0; 92.5
[125] 2019 | Ensemble (DenseNet 161 + ResNet 152 + ResNet 101) | BACH | 91.8
[126] 2020 | Hybrid Features + SVM | BACH | 92.2
[126] 2020 | Hybrid Features + MLP | BACH | 85.2
[126] 2020 | Hybrid Features + RF | BACH | 80.2
[126] 2020 | Hybrid Features + XGBoost | BACH | 82.7
[87] 2020 | Attention Guided CNN | BACH | 93.0
[99] 2020 | Random Forest | BACH | 91.2
[99] 2020 | SVM | BACH | 95.0
[99] 2020 | XGBoost | BACH | 42.5
[99] 2020 | MLP | BACH | 91.0
[104] 2021 | MA-MIDN | BACH | 93.57
[108] 2021 | AhoNet (ResNet18 + ECA + MPN-COV) | BACH | 85.0
[101] 2021 | Inception V3 + XGBoost | BACH | 87.0
[127] 2022 | DSAGu-CNN | BACH | 96.47
Ours | DEEP_Pachi | BACH | 99.9