Convolutional Neural Networks for Recognition of Lymphoblast Cell Images

This paper addresses the recognition of acute lymphoblastic leukaemia (ALL) subtypes under the WHO classification. The two ALL subtypes considered are T-lymphoblastic leukaemia (pre-T) and B-lymphoblastic leukaemia (pre-B). They exhibit varied characteristics that make it difficult to distinguish the subtypes from each other and from their mature counterparts, lymphocytes. In the common approach, handcrafted features must be carefully designed for this complex domain-specific problem. With a deep learning approach, handcrafted feature engineering can be eliminated because the multilayer architecture of a convolutional neural network (CNN) automates this task. In this work, we implement a CNN classifier to explore the feasibility of a deep learning approach for identifying lymphocytes and ALL subtypes, and we benchmark it against the dominant approach of support vector machines (SVMs) with handcrafted feature engineering. Additionally, two traditional machine learning classifiers, multilayer perceptron (MLP) and random forest, are applied for comparison. The experiments show that our CNN classifier delivers better performance in identifying normal lymphocytes and pre-B cells. This shows great potential for image classification without the multiple preprocessing steps required by feature engineering.


Introduction
Acute lymphoblastic leukaemia (ALL) is an acute malignancy of white blood cells that causes overproduction of immature lymphocytes, known as lymphoblasts, in the bone marrow. The disease progresses rapidly and inhibits the production of normal cells, causing death among children and young adults. ALL is a heterogeneous disease, meaning that distinct treatments are required for different groups of patients according to the subtype of the leukaemia. Individual ALL subtypes respond differently to particular chemotherapy. Therefore, subtype recognition provides essential prognostic information for treatment planning.
Under the WHO classification, ALL can be subdivided into T-lymphoblastic leukaemia (pre-T), B-lymphoblastic leukaemia (pre-B), and mature-B lymphoblastic leukaemia (mature-B) [1]. The identification of the subtypes requires a multiparametric approach, including morphology, immunophenotype, cytogenetic, and molecular findings. Despite the availability of advanced techniques, a morphological examination of blood smear samples is still the procedure for initial screening. The morphological examination can be assisted by computer-based systems.
There has been growing interest in developing tools that use image analysis and pattern recognition methods for quantification and identification of leukocytes [2][3][4][5]. They could make the analysis more efficient in terms of time and accuracy and assist pathologists in studying different patterns or cells from microscopic images. Since such a system requires only images, not blood samples, it offers low-cost methods and enables historical data records for future use in remote diagnostic systems.
The morphology of lymphocytes and ALL subtypes exhibits large variation among cells in the same class. At the same time, they show many characteristics which resemble cells belonging to different families. Some examples are shown in Table 1, which presents five samples of blood microscopic images of normal lymphocytes acquired from Labati et al. [6] and of pre-T and pre-B lymphoblasts from the American Society of Hematology (ASH) image bank [7]. Morphologically, lymphocytes present a compact nucleus with smooth boundary, blue-purple nucleus color, and low nucleus/cytoplasm (N/C) ratios [6,8]. In contrast, lymphoblasts exhibit irregularities with rough nucleus boundary, sparse red-purple nucleus color, and high N/C ratios [6,8]. Considering ALL subtypes based on the WHO classification, pre-T and pre-B lymphoblasts have the following characteristics.
Pre-T cells vary considerably from small blasts with very condensed nuclear chromatin and indistinct nucleoli to larger blasts with finely dispersed chromatin and prominent nucleoli [1]. A sparse amount of cytoplasm is commonly present. Some cytoplasmic granulation is frequently found; it looks like grains of dust in most cases and occasionally exhibits visible large granules [9]. Nuclei range from round to irregular to convoluted. Some characteristics, such as cleaved nuclei, cytoplasmic protrusion, or hand-mirror form, are also present [7].
Pre-B cells exhibit various characteristics, from small cells with scant cytoplasm, condensed nuclear chromatin, and inconspicuous nucleoli to medium-sized cells with moderate amounts of light blue cytoplasm, occasionally, finely dispersed nuclear chromatin, and relatively prominent nucleoli [1]. Most have a high N/C ratio. Other characteristics, such as elongated form, hand-mirror form, and round or irregular nuclear contour, are occasionally present [7].
The automated cell recognition of these subtypes has to handle this complex problem. A dominant approach is hand-crafted feature engineering with classification algorithms such as SVM, kNN, and MLP. For this approach, features are first extracted using image processing techniques and domain knowledge, and a combination of useful features is selected as input for the classification algorithms.
This approach has some disadvantages as a consequence of hand-crafted feature engineering. First, the approach may require domain expertise in determining useful features. Second, it relies on image processing techniques to extract useful features without introducing additional bias and error. Third, feature extraction operations are difficult to automate and possibly time-consuming.
Our deep learning approach implements a convolutional neural network which directly takes in pixel values from images and gradually constructs useful features through the use of a multilayer architecture. These features are then used to recognize the patterns relevant to the classification problem. Another consideration is the size of the data set. The amount of data is limited in many real-world problems, as is the case in our problem. For this reason, we utilize appropriate data augmentation techniques to increase the number of input images for training.
Our study aims to apply a deep learning approach to recognize lymphocytes and ALL subtypes, including pre-T and pre-B cells, from blood microscopic images. We omit mature-B since it is rare compared to the other two subtypes. To assess the performance of our deep learning approach, we compare the prediction accuracy and sensitivity of our CNN classifier with an SVM classifier employing hand-crafted feature engineering. To ensure a fair comparison, the SVM classifier is enhanced with feature selection and GA-based parameters optimization. In addition, two traditional machine learning classifiers, MLP and random forest, are also considered to show where our CNN approach is situated among other machine learning methods.

Related Work
The analysis of hematological images is generally divided into four major steps: image preprocessing, segmentation, feature extraction and selection, and classification. A considerable amount of work has focused on leukocyte segmentation [10][11][12][13][14][15][16]. For example, Mohapatra et al. [14] have proposed a segmentation method using color-based clustering to obtain the nucleus region and cytoplasm area from stained blood smear images. SVM classifiers are applied with relevant features and achieve satisfactory results. The automated classification of different types of white blood cells has been demonstrated in [17,18]. In [17], Osowski and Markiewicz have presented a fully automatic system able to recognize 17 classes of myelogenous leukaemia from images of bone marrow aspirate. Cells are segmented using a watershed algorithm combined with region-growing and edge detection techniques. A total of 117 descriptive features have been generated and selected using linear SVM. This algorithm has been improved by Osowski et al. [18]. The latter work has presented feature selection using genetic algorithms along with an SVM learning algorithm. The algorithm increases the accuracy of the recognition by more than 25%.
Reta et al. [19] have proposed a method to categorize two types of leukaemia, ALL and acute myeloid leukaemia (AML). The segmentation of blood cells is performed using contextual color and texture information to identify the nucleus and cytoplasm regions as well as to separate overlapped blood cells. Morphological, statistical, texture, size ratio, and eigenvalue features are extracted after segmentation to be used by various machine learning classifiers available in Weka.
Recently, deep learning techniques have become promising choices for medical image analysis. For example, in [20][21][22], convolutional neural networks have been applied as a methodology in microscopic analysis. Song et al. [20] have used a deep learning method based on superpixels and a convolutional neural network to detect the cytoplasm region in cervical cancer cell segmentation. The CNN approach is compared to different algorithms, namely backward propagation neural network (BPNN), probabilistic neural networks (PNN), support vector machine (SVM), and learning vector quantization (LVQ). CNN is superior to the other algorithms and produces an accuracy of 94.50% for nucleus region detection. For cytoplasmic and nucleus segmentation, CNN outperforms all three state-of-the-art methods as measured by F-measure, precision, and recall. Zhao et al. [21] have proposed automatic detection of white blood cells (WBCs) from peripheral blood images and classification of five types of WBCs: eosinophil, basophil, neutrophil, monocyte, and lymphocyte. Eosinophils and basophils are first separated from the other WBCs by an SVM with a granularity feature. The other three types are then recognized using a convolutional neural network to extract features, and a random forest uses these features to classify those WBCs.
Litjens et al. [22] have introduced deep learning as a technique to improve the objectivity and efficiency of histopathologic slide analysis. Convolutional neural networks are trained in two experiments: prostate cancer identification in biopsy specimens and breast cancer metastasis detection in sentinel lymph nodes. They show that this system holds great promise to reduce the workload of pathologists while increasing the objectivity of diagnoses.

Materials and Methods
Blood microscopic images are acquired from two different collections. The first collection comprises normal white blood cells obtained from Labati et al. [6]. We acquire 93 color images, each containing a single normal WBC. The second collection is composed of ALL subtypes, pre-T and pre-B cells, from the ASH image bank [7]. From the entire blood smear images, a single pre-T or pre-B cell is manually cropped from the whole scene. Each image contains a single cell and is rescaled to an equal size of 256 × 256 pixels. In the conventional approaches, we need to define the region of interest (ROI) against the background. Manual segmentation is performed to mask the whole cell region and the nucleus area. Then, the masks of nucleus and cytoplasm can be defined and stored as contour labels of the object.
In this work, a convolutional neural network applied to an image classification problem is called ConVNet. The ConVNet method directly uses the RGB values of the cell images for the learning procedure, which automatically extracts image features through a multilayer architecture. For the dominant approach, feature values are extracted from the object information of the image using image processing techniques. We feed these values into the implementation of SVM with GA-based feature selection and parameters optimization, namely, SVM-GA. These same feature values are also employed by the standard approaches of MLP and random forest. Further details of the proposed approach and implementation of the classifiers used in this work are presented in the following sections.

Convolutional Neural Networks. Convolutional neural networks have shown success in image classification [23][24][25]. The strength of a CNN lies in its ability to employ a multilayer architecture to automatically extract high-level features through a series of convolutional, nonlinear transformation, downsampling (pooling), and fully connected layers of the network.
To train a CNN for image classification, the network architecture must first be designed. This task is to determine the types, number, and order of layers in the network. The designed network, given a set of 2D images along with their corresponding class labels, attempts to find features useful for distinguishing the classes. A CNN employs a learning method that consists of two repeated and alternating passes, namely, the feedforward pass and the backward pass.
A typical CNN's feedforward pass performs two major tasks. The first task is feature extraction via the use of multiple convolutional feature extraction (CFE) layers. For this task, an image is passed through multiple CFE layers in a serial manner. A CFE layer consists of three sublayers: a convolutional sublayer, followed by a nonlinear transformation sublayer, and then by a pooling sublayer. Each CFE layer takes features from the previous layer and constructs higher-level features. This process often repeats many times in order to eventually extract high-level features from the image. These features then become input for the fully connected layers in the second task of a feedforward pass, which performs classification of the input image and obtains some error. In a backward pass, the error obtained from a feedforward pass propagates backward to adjust the weights in the convolutional sublayers so that they can better extract features relevant to the classification problem. The same error is also used to find proper weights for the fully connected layers.
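As an illustration of the three sublayers of a CFE layer, the following minimal sketch applies a zero-padded convolution, a ReLU, and 2 × 2 max pooling to a tiny single-channel image stored as nested lists. It is not the implementation used in this work: the function names are hypothetical, only one filter and one channel are handled, and, as in most CNN libraries, the "convolution" is computed as a cross-correlation.

```python
def conv2d_same(image, kernel):
    """Slide a square kernel over a 2D list, treating out-of-range pixels as 0."""
    h, w = len(image), len(image[0])
    k = len(kernel)
    pad = k // 2
    out = [[0.0] * w for _ in range(h)]
    for r in range(h):
        for c in range(w):
            total = 0.0
            for i in range(k):
                for j in range(k):
                    rr, cc = r + i - pad, c + j - pad
                    if 0 <= rr < h and 0 <= cc < w:  # zero padding at the border
                        total += image[rr][cc] * kernel[i][j]
            out[r][c] = total
    return out


def relu(fmap):
    """Nonlinear transformation sublayer: negative responses become 0."""
    return [[max(0.0, v) for v in row] for row in fmap]


def max_pool_2x2(fmap):
    """Pooling sublayer: keep the maximum of each 2 x 2 block, halving the size."""
    h, w = len(fmap), len(fmap[0])
    return [[max(fmap[r][c], fmap[r][c + 1], fmap[r + 1][c], fmap[r + 1][c + 1])
             for c in range(0, w - 1, 2)]
            for r in range(0, h - 1, 2)]


def cfe_layer(image, kernel):
    """One CFE layer: convolution, then ReLU, then max pooling."""
    return max_pool_2x2(relu(conv2d_same(image, kernel)))
```

In a trained network the kernel entries are the weights adjusted during the backward pass; here they would simply be supplied by the caller.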

Architecture of ConVNet.
The overall architecture of ConVNet used in this study is shown in Figure 1. The network consists of seven layers, excluding the input layer. The input layer takes in a 256 × 256 RGB color image in which each color channel is processed separately. The first, second, and third layers of ConVNet are CFE layers. The first and second CFE layers each apply 32 filters of size 3 × 3 to an image in the convolutional sublayer. The image's border is padded with 0 so that the convolution preserves the image size (256 for the first layer). The nonlinear transformation sublayer employs the ReLU activation function. The max pooling sublayer applies a 2 × 2 filter to the image, which halves the image size. The third CFE layer has a similar structure to the first one, except that the number of filters is 64. At this point, ConVNet extracts 64 features, each represented by a 32 × 32 array for each color channel. The fourth layer is a flatten layer. The flatten layer transforms a multidimensional array into a one-dimensional array by simply concatenating the entries of the multidimensional array together. The output of this flatten layer is a one-dimensional array of size 65536. The fifth layer is a fully connected artificial neural network (ANN) with the ReLU activation function that maps 65536 input values to 64 output values. The sixth layer is a dropout layer. Fifty percent of the input values coming into this layer are set to zero to reduce the problem of overfitting. The seventh layer is a fully connected ANN with the sigmoid activation function that maps 64 input values to 3 class labels.
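The layer sizes quoted above can be checked with a few lines of arithmetic. The sketch below is illustrative only; the function name trace_convnet_shapes is hypothetical, and the code assumes zero-padded 3 × 3 convolutions that preserve the spatial size and 2 × 2 pooling that halves it, as described in the text. The sizes of the two fully connected layers are copied from the text rather than derived.

```python
def trace_convnet_shapes(size=256, filters=(32, 32, 64)):
    """Return (layer name, output shape) pairs for the seven ConVNet layers."""
    shapes = []
    for index, n_filters in enumerate(filters, start=1):
        # The padded convolution keeps the spatial size; pooling halves it.
        size //= 2
        shapes.append(("CFE %d" % index, (size, size, n_filters)))
    flat = size * size * filters[-1]
    shapes.append(("flatten", (flat,)))
    shapes.append(("fully connected + ReLU", (64,)))
    shapes.append(("dropout 0.5", (64,)))
    shapes.append(("fully connected + sigmoid", (3,)))
    return shapes


for name, shape in trace_convnet_shapes():
    print(name, shape)
```

The third CFE layer yields 64 feature maps of 32 × 32, and 64 × 32 × 32 = 65536 matches the size of the flattened array.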

Procedure of ConVNet.
The overall procedure of image classification using ConVNet is presented in Figure 2. Since a large amount of data is essential in achieving high performance for CNN, we utilize data augmentation techniques to increase the number of images in the training set from 121 to 2420 images. The operations used for data augmentation are horizontal flip, shearing within 0.2 radians in the counterclockwise direction, and zooming between 0.8 and 1.2.
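To make the augmentation ranges concrete, the sketch below draws one set of random transformation parameters per augmented image. It only samples parameters and does not transform pixels. Several details are assumptions not stated in the text: 20 augmented copies per original image (inferred from 2420 / 121), uniform sampling within each range, and a flip probability of 0.5. The function name sample_augmentations is hypothetical.

```python
import random


def sample_augmentations(n_images=121, copies_per_image=20, seed=0):
    """Draw flip, shear, and zoom settings for every augmented copy."""
    rng = random.Random(seed)
    params = []
    for image_index in range(n_images):
        for _ in range(copies_per_image):
            params.append({
                "image_index": image_index,
                "horizontal_flip": rng.random() < 0.5,  # assumed probability
                "shear": rng.uniform(0.0, 0.2),         # radians, counterclockwise
                "zoom": rng.uniform(0.8, 1.2),
            })
    return params


augmented = sample_augmentations()
print(len(augmented))  # 121 * 20
```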
First, we train ConVNet using the data in the training set to find appropriate filter weights in the three convolutional sublayers and the weights that yield minimum error in the two fully connected layers. Next, we evaluate ConVNet using the data in the validation set to obtain the validation error and cross-entropy loss. We then train ConVNet again using a new training set created by data augmentation of the original 121 training images. We repeat the training of ConVNet in this same procedure until we complete 50 epochs. Last, we evaluate the performance of ConVNet using the data in the test set.

Time-Complexity of ConVNet.
The time-complexity of ConVNet includes the time costs for the three CFE layers, the flatten layer, the dropout layer, and the two fully connected layers. For both training and testing, the time costs for the CFE layers dominate the overall complexity. Furthermore, for each CFE layer, the time cost for the convolutional sublayer exceeds the time cost for the nonlinear transformation and max pooling sublayers combined.
In general, the total time-complexity of all convolutional sublayers can be written as follows [26]:

$$O\left(\sum_{l=1}^{d} n_{l-1} \cdot s_l^{2} \cdot n_l \cdot m_l^{2}\right),$$

where $l$ is the index of a convolutional sublayer, $d$ is the number of convolutional sublayers, $n_l$ is the number of filters in the $l$th layer (so that $n_{l-1}$ is the number of input channels of the $l$th layer), $s_l$ is the spatial size (length) of the filter in the $l$th layer, and $m_l$ is the spatial size of the output feature map in the $l$th layer. For ConVNet, which consists of three convolutional sublayers, the time-complexity can be estimated by calculating the total number of convolutional operations performed (per image) in a single feedforward or backward pass (as shown in Table 2).
In terms of the difference between training and testing times per image, training takes three times as long as testing since it requires both feedforward and backward passes while testing only performs a feedforward pass [26]. The classification step operates in the same manner as testing; therefore, it includes only the time for one feedforward pass.
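A rough operation count for the three convolutional sublayers can be sketched as follows, using the per-layer product from [26]: input channels × filter size squared × number of filters × output map size squared. The layer tuples are assumptions read off the architecture description (3 input channels, and output map sizes of 256, 128, and 64 before pooling), so the resulting figures are illustrative and need not coincide with Table 2.

```python
def conv_operations(layers):
    """Sum n_in * s^2 * n_out * m^2 over the convolutional sublayers."""
    return sum(n_in * s * s * n_out * m * m for n_in, s, n_out, m in layers)


# (input channels, filter size, number of filters, output map size); assumed values
CONVNET_LAYERS = [
    (3, 3, 32, 256),
    (32, 3, 32, 128),
    (32, 3, 64, 64),
]

per_layer = [conv_operations([layer]) for layer in CONVNET_LAYERS]
total = conv_operations(CONVNET_LAYERS)
print(per_layer, total)
```

Under these assumptions the second sublayer is the most expensive, because it combines 32 input channels with a still fairly large 128 × 128 output map.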

Feature Extraction for SVM-GA, MLP, and Random Forest
To facilitate the process of cell recognition, we need numerical feature values that capture the characteristics best correlated within the same class while enhancing the differences between cell images belonging to different classes. These values can be used to detect variations in shape, cell size, granulation, intensity, color, etc. The segmented nucleus and cytoplasm of each individual cell image are described by various numerical values representing features from three main groups: geometrical, textural, and color features. All features are presented in Table 3. The 46 features generated for classification are summarized as follows.
(i) The geometrical features describe differences in the structure, shape, and size of leukocytes. The geometrical feature extraction is mainly based on a region-based and a contour-based approach. In the region-based approach, each cell image is first converted to a binary image, and then features 1-7 are extracted from the cell geometry. Features 1-5 are adopted from Mohapatra et al. [8]. Features 6 and 7 are newly presented in this work and are defined as follows. Feature 6 measures symmetry by folding the nucleus shape about a line of symmetry taken along the nucleus major axis. The numerical value of shape symmetry can be defined as

$$\text{symmetry} = \frac{\text{part}_1}{\text{part}_2},$$

where $\text{part}_1$ is the overlapping area between the two separated parts with respect to the line of symmetry and $\text{part}_2$ is the larger area of the two separated parts. Therefore, the nucleus shape is more symmetric when the value is closer to 1. Feature 7 measures how strongly a cell presents the hand-mirror shape. Hand-mirror cell (HMC) lymphoid leukaemia is an unusual variant of ALL, in which the lymphoblasts manifest distinctive hand-mirror morphologic features. As shown in Figure 3, the proportion $(a + c)/(b + c)$ is used to identify how close the cell is to a pre-B cell or pre-T cell. The distances $a$ and $b$ are the semimajor axis and the semiminor axis of the nucleus, respectively, and $c$ is the maximum distance from the center of the nucleus to the hand-mirror part of the cell.
(ii) [...] [27] in features 16-21, Haralick's texture [28] in features 22-26, and Fourier descriptors [29] in features 27-34 are applied to detect the textural transformations. The Fourier descriptors are obtained from the two-dimensional discrete forward and inverse Fourier transforms in features 27-30 and 31-34, respectively.
(iii) The color appearance is an important characteristic that is used to examine the abnormality of lymphocytes since normal and malignant cells have different staining capacity and granulation. Excessive staining capacity of nuclei normally appears in chromatin abnormality, and variation in color intensity is usually present due to the existence of granules. The color variation can be measured as the mean color intensity in the RGB and HSV color spaces. These features are calculated from the nucleus region (35-40) and the cytoplasm (41-46).
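Features 6 and 7 can be sketched on a toy example. The code below is a simplified illustration, not the extraction code used in this work. It assumes that the symmetry value is the overlap area divided by the larger of the two folded areas, that the nucleus mask has already been rotated so that its major axis is a vertical line just left of column axis_col, and that the distances a, b, and c have been measured beforehand. Both function names are hypothetical.

```python
def symmetry_score(mask, axis_col):
    """Fold a binary mask about a vertical line and compare the two halves."""
    overlap = left_area = right_area = 0
    for row in mask:
        left = row[:axis_col][::-1]  # left part, mirrored onto the right part
        right = row[axis_col:]
        left_area += sum(left)
        right_area += sum(right)
        overlap += sum(1 for a, b in zip(left, right) if a and b)
    largest = max(left_area, right_area)
    return overlap / float(largest) if largest else 0.0


def hand_mirror_ratio(a, b, c):
    """Proportion (a + c) / (b + c) from the semi-axes a, b and the distance c."""
    return (a + c) / float(b + c)


symmetric = [[0, 1, 1, 0],
             [1, 1, 1, 1]]
lopsided = [[1, 1, 1, 0],
            [1, 1, 0, 0]]
print(symmetry_score(symmetric, 2), symmetry_score(lopsided, 2))
```

A perfectly mirrored shape scores 1, while the lopsided mask, whose right half covers only one pixel of the folded left half, scores much lower.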

Classification of ALL Using SVM with GA-Based Parameters Optimization
Support Vector Machines for Classification. Support vector machines (SVMs) are based on the concept of decision planes that define decision boundaries and perform classification tasks by constructing hyperplanes in a multidimensional space [30]. To construct an optimal hyperplane, SVM employs an iterative training algorithm which is used to minimize an error function described in the following equation:

$$\min_{w, b, \varepsilon} \; \frac{1}{2} w^{T} w + C \sum_{i=1}^{N} \varepsilon_i$$

subject to the constraints

$$y_i \left( w^{T} \varphi(x_i) + b \right) \geq 1 - \varepsilon_i, \quad \varepsilon_i \geq 0, \quad i = 1, 2, \ldots, N,$$

where $C$ is the penalty parameter, $w$ is the vector of coefficients, $b$ is a constant, and $\varepsilon_i$ represents the parameter for handling input data $i$. The index $i$ labels the $N$ training cases, $y_i \in \{-1, 1\}$ represents the class label, and $x_i$ represents the independent variable. The kernel $\varphi$ is used to transform data from the input space to the feature space.
One important choice when using SVMs is the selection of an appropriate kernel function that is needed for efficiently handling nonlinearly separable data sets. The radial basis function (RBF) kernel is often chosen for this purpose [31,32], but it has the drawback that all input features are considered equally important when computing similarities between two feature vectors. Therefore, to make optimal use of SVMs with RBF kernels, preprocessing of the input features is important when one wants to achieve the highest possible accuracy. The RBF kernel on two samples $x_i$ and $x_j$ is defined in the following equation:

$$K(x_i, x_j) = \exp\left(-\gamma \lVert x_i - x_j \rVert^{2}\right),$$

where $\gamma$ is the gamma parameter. The behavior of the model depends on both parameters $C$ and $\gamma$. The parameter $C$ determines how much penalty should be given for misclassification. The parameter $\gamma$ can be seen as the inverse of the radius of influence of samples selected by the model as support vectors. As $\gamma$ increases, each support vector has a less widespread influence, which makes the algorithm try harder to avoid misclassifying training data and leads to overfitting.
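The effect of the gamma parameter can be seen numerically. The sketch below assumes the standard RBF form, the exponential of minus gamma times the squared Euclidean distance between the two samples; the function name rbf_kernel and the sample vectors are illustrative only.

```python
import math


def rbf_kernel(x_i, x_j, gamma):
    """Similarity of two feature vectors under an RBF kernel."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x_i, x_j))
    return math.exp(-gamma * squared_distance)


near = rbf_kernel([0.0, 0.0], [1.0, 1.0], gamma=0.1)
far = rbf_kernel([0.0, 0.0], [1.0, 1.0], gamma=10.0)
print(near, far)
```

With a small gamma the two samples still look similar, whereas with a large gamma the similarity is close to zero, which is the narrow radius of influence associated with overfitting. Because every feature enters the squared distance with the same weight, features on larger numeric scales dominate unless the inputs are rescaled.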

GA-Based Feature Selection and Parameters Optimization. From the aforementioned considerations of feature selection and parameter tuning, we adopt the GA approach from Huang and Wang [33] for these optimization tasks. For the chromosome encoding in this work, a string of binary values is used to define three parts: $C$, $\gamma$, and the feature mask $f$. In Figure 4, $g_1$, $g_2$, and $g_3$ define the bit strings of $C$, $\gamma$, and $f$, respectively. The lengths of the parts are $n_C$, $n_\gamma$, and $n_f$; the number of bits depends on the size of the parameters used by the kernel function and the number of features in the data set. The $C$ and $\gamma$ parts of the bit strings in Figure 4 must be decoded from binary to decimal by the following equation [33]:

$$p = P_{\min} + \frac{P_{\max} - P_{\min}}{2^{l} - 1} \times d,$$

where $p$ is the decoded parameter value, $P_{\max}$ and $P_{\min}$ are the maximum and the minimum values of the parameter, $d$ is the decimal value of the bit string, and $l$ is the length of the bit string. The GA operators used in our approach are reproduction or selection by the roulette wheel mechanism, single-point crossover, and mutation using bit alteration. Two parents are first selected by the selection operator for reproduction. Based on a random probability of crossover $p_c$, if crossover occurs, a position $x$ of the string with size $l$ is randomly chosen, and the alleles at positions $x$ to $l$ are exchanged from one parent to the other. If no crossover occurs, the parents are directly copied to the new population. Different from Huang and Wang [33], we treat the parameter part ($C$ and $\gamma$) of the string and the feature mask part differently since they have different meanings. The first part is converted to a decimal value, and the second part, the feature mask, is used directly. Therefore, the crossover operator is applied to each part separately.
Apart from the crossover operator, the mutation operator is used to perturb bit values with a low probability to maintain genetic diversity. Each bit of an individual can be reversed from 0 to 1 or 1 to 0 with probability $p_m$. This applies to all individuals which are placed in the new population.
The overall procedure of the SVM classifiers with GA-based feature selection and parameters optimization is presented in Figure 5. In the SVM learning, GA operations are used to adjust the parameters. After the trained SVM classifier is obtained in each round, the validation data with the selected feature subset and parameters are tested by the SVM classifier. Each chromosome is evaluated according to the average classification accuracy obtained from the validation data. The optimized parameters ($C$ and $\gamma$) and the feature subset are finally obtained for the final SVM classifier, which is evaluated using the data in the test set.
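The chromosome handling described above can be sketched in a few functions. This is a minimal illustration under stated assumptions, not the implementation used in this work: chromosomes are Python lists of 0/1 values, the decoding follows the linear mapping of [33] from a bit string of length l to the interval between the minimum and maximum parameter values, and every part has at least two bits. All function names are hypothetical.

```python
import random


def decode_parameter(bits, p_min, p_max):
    """Map a bit string to [p_min, p_max] via its decimal value d."""
    d = int("".join(str(b) for b in bits), 2)
    return p_min + (p_max - p_min) * d / float(2 ** len(bits) - 1)


def single_point_crossover(parent_a, parent_b, rng):
    """Exchange the tails of two equal-length strings at a random position."""
    x = rng.randrange(1, len(parent_a))
    return parent_a[:x] + parent_b[x:], parent_b[:x] + parent_a[x:]


def crossover_by_part(parent_a, parent_b, n_params, p_c, rng):
    """Cross the parameter part and the feature-mask part separately."""
    if rng.random() >= p_c:  # no crossover: copy the parents unchanged
        return parent_a[:], parent_b[:]
    a1, b1 = single_point_crossover(parent_a[:n_params], parent_b[:n_params], rng)
    a2, b2 = single_point_crossover(parent_a[n_params:], parent_b[n_params:], rng)
    return a1 + a2, b1 + b2


def mutate(bits, p_m, rng):
    """Flip each bit independently with probability p_m."""
    return [1 - b if rng.random() < p_m else b for b in bits]
```

Here n_params is the combined length of the two parameter bit strings, so the feature mask occupies the remaining positions of the chromosome.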

Experiments
The original data set contains 363 images. There are two types of data: the cell images for ConVNet and the feature values extracted from the corresponding cell images for SVM-GA, MLP, and random forest. Both types of data are divided into training, validation, and testing sets. Each set contains 31 normal cell images, 45 pre-T cell images, and 45 pre-B cell images. Samples from the data set are randomly selected with ten different seeds to generate ten different combinations for both types of data.
Training and validation data are used for building the models and testing data are for evaluating the classifiers' performance.
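The splitting scheme can be sketched as a seeded, per-class shuffle. The code below is a hypothetical illustration rather than the script used in this work. The class totals of 93, 135, and 135 follow from three sets of 31, 45, and 45 images each; the seed values 0 to 9 and the use of integer identifiers in place of image files are assumptions.

```python
import random

CLASS_TOTALS = {"normal": 93, "pre-T": 135, "pre-B": 135}
PER_SET = {"normal": 31, "pre-T": 45, "pre-B": 45}


def split_dataset(seed):
    """Shuffle each class with the given seed and cut it into three equal sets."""
    rng = random.Random(seed)
    sets = {"train": [], "validation": [], "test": []}
    for label, total in sorted(CLASS_TOTALS.items()):
        ids = [(label, i) for i in range(total)]
        rng.shuffle(ids)
        n = PER_SET[label]
        sets["train"] += ids[:n]
        sets["validation"] += ids[n:2 * n]
        sets["test"] += ids[2 * n:3 * n]
    return sets


# Ten combinations, one per seed (assumed seed values)
splits = [split_dataset(seed) for seed in range(10)]
```

Splitting class by class keeps the 31/45/45 composition identical in all three sets for every seed.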
The quality of the models is evaluated by accuracy and sensitivity. With $n$ classes, there is an $n \times n$ confusion matrix consisting of the elements $C_{ij}$. The diagonal entries $C_{ii}$ represent the numbers of correctly recognized samples of each class. Typically, the accuracy $A$ can be defined in the following equation:

$$A = \frac{\sum_{i=1}^{n} C_{ii}}{\sum_{i=1}^{n} \sum_{j=1}^{n} C_{ij}}.$$

Sensitivity $S_i$ for class $i$ measures the ratio of the number of patterns that are correctly recognized in class $i$ to the total number of patterns in class $i$, defined in the following equation [34]:

$$S_i = \frac{C_{ii}}{\sum_{j=1}^{n} C_{ij}}.$$

Parameter settings for all four approaches are summarized in Table 4.
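Both measures are straightforward to compute from a confusion matrix. The sketch below assumes that row i of the matrix holds the samples whose true class is i, so that a row sum is the total number of patterns in that class. The example matrix is made up for illustration and is not a result from this work.

```python
def accuracy(confusion):
    """Fraction of all samples that lie on the diagonal."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / float(total)


def sensitivity(confusion, i):
    """Fraction of the samples of true class i that were recognized as class i."""
    return confusion[i][i] / float(sum(confusion[i]))


# Hypothetical test-set outcome; rows and columns: normal, pre-T, pre-B
example = [
    [31, 0, 0],
    [0, 30, 15],
    [0, 8, 37],
]
print(accuracy(example), [sensitivity(example, i) for i in range(3)])
```

In this made-up case the overall accuracy is 98/121, yet the sensitivity for the second class is only 30/45, which shows why per-class sensitivity is reported alongside accuracy.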

Results and Discussion
In the SVM-GA approach, the first task is to obtain an SVM classifier for a binary classification of normal lymphocytes and lymphoblast cells. The second SVM model is produced to distinguish pre-T and pre-B cells. Therefore, the SVM models with the selected feature subsets and the optimized parameters are used to classify three classes of cells: normal lymphocytes, pre-T, and pre-B cells. If a testing sample is identified as normal, then the result is obtained. Otherwise, the sample is further identified as a pre-T or pre-B cell.
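The two-stage decision can be sketched as a small cascade. The code below is illustrative only: the two decision functions stand in for the trained SVM models, and the feature names and thresholds in the stand-ins are arbitrary placeholders, not values from this work.

```python
def classify_cell(features, is_normal, is_pre_t):
    """Two-stage decision: normal vs. lymphoblast, then pre-T vs. pre-B."""
    if is_normal(features):
        return "normal"
    return "pre-T" if is_pre_t(features) else "pre-B"


# Placeholder decision rules; real ones would be the two trained SVM classifiers.
def stage_one(features):
    return features["nc_ratio"] < 0.5


def stage_two(features):
    return features["hand_mirror_ratio"] > 1.2


print(classify_cell({"nc_ratio": 0.8, "hand_mirror_ratio": 1.5}, stage_one, stage_two))
```

One consequence of this design is that an error in the first stage cannot be corrected by the second, since a sample labeled normal never reaches the subtype classifier.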
To evaluate the performance of our deep learning approach, we compare ConVNet with the dominant approach of SVM-GA and two traditional machine learning methods, namely, MLP and random forest. Table 5 presents the accuracy results obtained from these approaches on ten test sets and shows the average with standard deviation over the ten performance estimates. Considering the average accuracy, the two traditional approaches cannot achieve accuracy above 80%, while ConVNet and SVM-GA yield average accuracy above 80% and produce comparable results, differing by a very small margin. Over the ten runs, most of the results obtained by both ConVNet and SVM-GA are above 80% and range approximately from 78% to 86%.
To explore the sensitivity for each class, Table 6 presents the comparative sensitivity results for each class over ten test sets between ConVNet and SVM-GA, and Table 7 displays the comparative sensitivity results between MLP and random forest. Starting from the identification of normal lymphocytes, ConVNet, MLP, and random forest provide comparably high average sensitivity of almost 100%. These three methods clearly outperform SVM-GA, which achieves less than 95% sensitivity in identifying normal lymphocytes.

Computational Intelligence and Neuroscience
Comparing the two subtypes, ConVNet, MLP, and random forest produce similar results in identifying pre-T cells with 68-70% sensitivity; however, SVM-GA is able to obtain a higher sensitivity of 75%. For the classification of pre-B cells, ConVNet and SVM-GA deliver sensitivity above 80%, which is higher than that of MLP and random forest.
Considering the variances in the sensitivity measurements for all approaches, ConVNet produces lower standard deviations for lymphocyte and pre-B classifications. MLP and random forest generate high variances over ten test runs for pre-T and pre-B identifications, whereas SVM-GA produces the highest variances among all four approaches for all three types of cell image classification. This behavior may be an indication of overfitting. The final results are drawn from the confusion matrices of the two best classifiers, ConVNet and SVM-GA, to reveal how the classifiers identify the testing data. The matrices are chosen from the worst and best accuracy results and are depicted in Tables 8 and 9, respectively. The results show the relative misclassification between pre-T and pre-B cells. This is due to the high similarity of these two classes.

Conclusions
In this work, we present a deep learning approach to recognize normal lymphocytes and ALL subtypes defined by the WHO classification. We implement a CNN, namely, ConVNet, which directly takes raw images and automatically discovers useful features through a multilayer architecture. The performance of our deep learning model is evaluated against the dominant approach of an SVM classifier, namely, SVM-GA, and two traditional machine learning approaches, MLP and random forest.
Figure 5: Feature selection and parameters optimization using the GA-based technique from [33].

In terms of prediction accuracy, the deep learning approach of ConVNet and the dominant approach of SVM-GA are able to clearly outperform the two traditional approaches of MLP and random forest. In fact, the average accuracy of ConVNet and SVM-GA is comparable. However, when the sensitivity, which measures the accuracy for a particular class, is explored, we observe a clearer picture of the two classifiers' performance. ConVNet performs better in detecting normal lymphocytes and slightly better in detecting pre-B cells. Regarding the classification of pre-T cells, neither classifier can deliver good results above 80% accuracy. Although SVM-GA demonstrates its ability to detect pre-T cells better than ConVNet, it may have suffered from overfitting, as suggested by its consistently high variances.
For the problem of recognizing lymphoblast cells, a deep learning approach of CNN is superior to MLP and random forest in all three classes, and it is able to outperform the dominant approach of SVM classifier employing GA-based parameters optimization for two out of the three classes. Taking into consideration that a CNN method requires no hand-crafted feature engineering, which is an error-prone and possibly time-consuming preprocessing step, this deep learning approach demonstrates a great potential for lymphoblast cell image classification.

Data Availability
The microscopic images are acquired from two different collections [6,7]. The images have been rescaled to an equal size of 256 × 256 pixels, and each contains a single cell. The image data used in this work are available at http://mcs.sat.psu.ac.th/dataset/dataset.zip.

Additional Points
Hardware and Software. The computer used for our experiments is a Mac Pro configured with a 3.7 GHz quad-core Intel Xeon E5, 12 GB DDR3 memory, and an AMD FirePro D300 GPU with 2048 MB, running Mac OS X 10.11.6. The implementation uses Python 2.7.12, Keras 1.2.2, and scikit-learn 0.19.2.

Conflicts of Interest
The authors declare that they have no conflicts of interest.