Detection of Tuberculosis Bacilli in Tissue Slide Images using HMLP Network Trained by Extreme Learning Machine

M. K. Osman, Faculty of Electrical Engineering, Universiti Teknologi MARA, 40450 Selangor, Malaysia, phone: +60194060406, e-mail: khusairi@ppinang.uitm.edu.my
M. Y. Mashor, School of Mechatronic Engineering, Universiti Malaysia Perlis, 02600, Perlis, Malaysia, phone: +6049798335, e-mail: yusoff@unimap.edu.my
H. Jaafar, Department of Pathology, School of Medical Science, Universiti Sains Malaysia, 16160 Kelantan, Malaysia, phone: +6097664229, e-mail: hasnan@kb.usm.my


Introduction
Tuberculosis, commonly known as TB, is a deadly disease caused by infection with Mycobacterium tuberculosis. The bacterium usually attacks the lungs, causing pulmonary TB (PTB), yet there are also cases where it strikes other parts of the human body, referred to as extrapulmonary TB (EPTB). The clinical diagnosis of TB is performed by microscopic examination using either a fluorescence microscope or a light microscope. For PTB, the diagnosis is conducted by sputum examination. For EPTB, biopsied tissue from the infected organ is used for diagnosis. Clinical specimens are stained using the auramine-rhodamine stain for analysis under the fluorescence microscope, while the Ziehl-Neelsen (ZN) stain is used for the light microscope.
Early detection and rapid treatment are important in the control of TB. However, conventional manual screening for TB is time-consuming and tedious, especially for negative slides. Furthermore, an accurate diagnosis requires an assessment by a well-trained microbiologist, yet high incidence rates of TB have been recorded in low- and medium-income countries [1], which often lack well-trained medical staff. These problems pose a major obstacle to obtaining rapid and accurate results, and hence hinder the early detection of TB.
The rapid advancement of computer hardware, software, image processing algorithms and artificial intelligence has led to the development of various computer-aided systems for TB detection. These systems aim to assist medical technologists in the diagnostic process. A number of techniques for automated PTB detection using sputum smears have been proposed in the literature. Some of these techniques use fluorescence microscopic images [2][3], whilst others use light microscopic images [4][5][6]. The fluorescence microscope is about 10% more sensitive than the light microscope in detecting TB bacilli. However, owing to the high cost of the fluorescence microscope and the difficulty of maintaining the equipment, the light microscope remains the most widely used tool for the screening and diagnosis of TB [7].
While most research has been concerned with PTB, little has been done on EPTB detection. According to [8], EPTB contributes 15% to 20% of all TB cases and accounts for more than 50% of the cases in HIV-positive patients. Sadaphal et al. [4] were the first to propose an automatic method for identifying TB bacilli in both sputum smears and tissue sections. However, only the results on sputum smears were shown in their report. Tadrous [9] proposed an image ranking algorithm to assist medical technologists in searching for TB bacilli. The method used a 'colour score' and a 'shape score' to calculate the probability that an image contains TB bacilli, and ranked all the images from the highest probability downwards. However, the method requires a medical technologist to determine manually the positions and regions of TB bacilli in a tissue slide image. More recent work by Osman et al. [10] used k-means clustering and the saturation component of the C-Y colour model to segment the TB bacilli in tissue slide images.
Recent work in automated TB detection has used artificial intelligence as a tool for classification. Among neural networks, the multilayered perceptron (MLP) network is the most commonly used. However, the major drawback of MLP learning algorithms is slow learning, owing to gradient-based learning and the iterative tuning of all parameters. The Extreme Learning Machine (ELM) [11], which offers faster learning speed and better performance, has been claimed to overcome these issues.
The current study focuses on the automated detection of TB bacilli in ZN-stained tissue images. Images are captured from ZN-stained tissue slides using a light microscope. Several image processing tasks are implemented to segment the bacilli.
In this study, the ELM is modified to speed up the training of the hybrid multilayered perceptron (HMLP) network; the resulting network is called the HMLP-ELM network. The proposed HMLP-ELM network is evaluated for TB bacilli detection. The study also draws a comparison with the original version of the HMLP network, which was trained using the Modified Recursive Prediction Error (MRPE) algorithm [12], and with the more recent version trained using the Modified Recursive Least Square (MRLS) algorithm [13].

Theoretical background
This section gives a brief overview of the HMLP network. The ELM is also reviewed in terms of its basic concept and implementation. A method for training the HMLP network using the ELM training algorithm is presented at the end of the section.
A. Hybrid Multilayered Perceptron Network. One of the most common and popular neural networks is the MLP. It is a feedforward neural network which consists of an input layer, a hidden layer and an output layer. Consider an MLP network with $N_i$ inputs and $N_h$ hidden nodes, where $v_i(t)$ is the i-th input signal at the t-th sample. The output of the j-th hidden node is given by
$$h_j(t) = F\left(\sum_{i=1}^{N_i} w^1_{ij} v_i(t) + b^1_j\right), \quad 1 \le j \le N_h, \; 1 \le t \le N, \tag{1}$$
where $w^1_{ij}$, $b^1_j$, $F(\cdot)$ and $N$ represent the weights that connect the input and hidden layers, the thresholds in the hidden nodes, the activation function and the number of samples, respectively. In this study, the sigmoidal function is used as the activation function. The output of the k-th neuron, $\hat{y}_k(t)$, can be expressed as
$$\hat{y}_k(t) = \sum_{j=1}^{N_h} w^2_{jk} h_j(t), \quad 1 \le k \le N_o, \tag{2}$$
where $w^2_{jk}$ and $N_o$ denote the weights that connect the hidden and output layers, and the number of output nodes, respectively.
Mashor [12] proposed a modified version of the MLP network with additional linear connections, called the hybrid multilayered perceptron (HMLP) network. The HMLP network allows the input layer to be connected directly to the output layer using weighted connections, as illustrated in Fig. 1. The output of the k-th neuron of an HMLP network is the sum of the linear and nonlinear connections and can be written as
$$\hat{y}_k(t) = \sum_{j=1}^{N_h} w^2_{jk} h_j(t) + \sum_{i=1}^{N_i} w^L_{ik} v_i(t), \tag{3}$$
where $w^L_{ik}$ denotes the weights of the linear connection between the input and output layers. The weights and biases are estimated using a modified version of the recursive prediction error algorithm, called the Modified Recursive Prediction Error (MRPE) algorithm, which is described in detail in [12]. More recently, Al-Batah et al. [13] introduced a modified version of the recursive least square algorithm, called the Modified Recursive Least Square (MRLS) algorithm, to improve the performance of the HMLP network.
In order to train a single layer feedforward neural network (SLFNN) using the ELM [11], the weights between the input layer and the hidden layer, and the thresholds, are first selected randomly. Then, by using (1), the hidden layer output matrix is formed as follows:
$$H = \begin{bmatrix} h_1(1) & \cdots & h_{N_h}(1) \\ \vdots & \ddots & \vdots \\ h_1(N) & \cdots & h_{N_h}(N) \end{bmatrix}. \tag{4}$$
In the case where the SLFNN has perfectly minimised the error function in the training process, the output of the SLFNN is equivalent to the solution of the linear system
$$H\beta = T, \tag{5}$$
where $\beta$ is the matrix of hidden to output layer weights and $T$ is the actual output:
$$\beta = \begin{bmatrix} w^2_{11} & \cdots & w^2_{1 N_o} \\ \vdots & \ddots & \vdots \\ w^2_{N_h 1} & \cdots & w^2_{N_h N_o} \end{bmatrix}, \tag{6}$$
$$T = \begin{bmatrix} y_1(1) & \cdots & y_{N_o}(1) \\ \vdots & \ddots & \vdots \\ y_1(N) & \cdots & y_{N_o}(N) \end{bmatrix}. \tag{7}$$
Referring to (5), the optimal value of $\beta$ can be calculated using the Moore-Penrose generalized inverse of $H$, denoted $H^{\dagger}$:
$$\hat{\beta} = H^{\dagger} T. \tag{8}$$
Finally, the SLFNN output can be calculated using $H$ from (5) and $\hat{\beta}$ from (8), as follows:
$$\hat{Y} = H\hat{\beta}. \tag{9}$$
A more detailed description of the ELM can be found in [11].
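The ELM training procedure described above can be sketched in a few lines of NumPy. This is only an illustrative sketch, not the authors' Matlab implementation: the function names, the uniform [-1, 1] range for the random weights and thresholds, and the toy data are all assumptions made for the example.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def elm_train(V, T, n_hidden, rng):
    """Train an SLFNN with the ELM. V: (N, N_i) inputs, T: (N, N_o) targets."""
    n_inputs = V.shape[1]
    # Input-to-hidden weights and thresholds are drawn at random and never tuned.
    # The range [-1, 1] is an assumption of this sketch.
    W1 = rng.uniform(-1.0, 1.0, size=(n_inputs, n_hidden))
    b1 = rng.uniform(-1.0, 1.0, size=n_hidden)
    H = sigmoid(V @ W1 + b1)          # hidden layer output matrix, shape (N, N_h)
    beta = np.linalg.pinv(H) @ T      # Moore-Penrose solution of H beta = T
    return W1, b1, beta

def elm_predict(V, W1, b1, beta):
    return sigmoid(V @ W1 + b1) @ beta

# Toy data (hypothetical): 40 samples, 3 inputs, 1 binary target.
rng = np.random.default_rng(0)
V = rng.uniform(0.0, 1.0, size=(40, 3))
T = (V.sum(axis=1, keepdims=True) > 1.5).astype(float)

W1, b1, beta = elm_train(V, T, n_hidden=10, rng=rng)
Y = elm_predict(V, W1, b1, beta)
```

Because the only estimated quantity is the output weight matrix, training reduces to one pseudoinverse computation with no iterations or learning rate.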
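C. HMLP network trained by ELM. In the present study, the ELM is utilised to train the HMLP network. For simplicity, the network will be referred to as the HMLP-ELM network. For an HMLP-ELM network, the $H$ and $\beta$ matrices, as mentioned in (4) and (6), are modified to accommodate the linear connection, to yield:
$$H = \begin{bmatrix} h_1(1) & \cdots & h_{N_h}(1) & v_1(1) & \cdots & v_{N_i}(1) \\ \vdots & & \vdots & \vdots & & \vdots \\ h_1(N) & \cdots & h_{N_h}(N) & v_1(N) & \cdots & v_{N_i}(N) \end{bmatrix}, \tag{10}$$
$$\beta = \begin{bmatrix} w^2_{11} & \cdots & w^2_{1 N_o} \\ \vdots & & \vdots \\ w^2_{N_h 1} & \cdots & w^2_{N_h N_o} \\ w^L_{11} & \cdots & w^L_{1 N_o} \\ \vdots & & \vdots \\ w^L_{N_i 1} & \cdots & w^L_{N_i N_o} \end{bmatrix}, \tag{11}$$
where $h_j(t)$ is the output of the j-th hidden node, $v_i(t)$ represent the input signals and $w^L_{ik}$ represent the weights of the linear connection between the input and output layers, as stated previously.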
In brief, the HMLP-ELM training estimates both the hidden to output layer weights and the linear connection weights, while in the ELM, only the hidden to output layer weights are estimated.
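A minimal NumPy sketch of this idea is given below, assuming the hidden layer output matrix is simply extended with the raw input columns so that a single pseudoinverse yields both weight sets. The function names, the [-1, 1] initialisation range and the toy target are hypothetical choices for illustration, not details taken from the paper.

```python
import numpy as np

def sigmoid(x):
    return 1.0 / (1.0 + np.exp(-x))

def hmlp_elm_train(V, T, n_hidden, rng):
    """V: (N, N_i) inputs, T: (N, N_o) targets."""
    n_inputs = V.shape[1]
    # Random, untuned input-to-hidden weights and thresholds (range assumed).
    W1 = rng.uniform(-1.0, 1.0, size=(n_inputs, n_hidden))
    b1 = rng.uniform(-1.0, 1.0, size=n_hidden)
    # Augmented matrix [hidden outputs | inputs], shape (N, N_h + N_i).
    H_aug = np.hstack([sigmoid(V @ W1 + b1), V])
    beta_aug = np.linalg.pinv(H_aug) @ T
    W2 = beta_aug[:n_hidden]   # hidden-to-output weights
    WL = beta_aug[n_hidden:]   # linear input-to-output weights
    return W1, b1, W2, WL

def hmlp_predict(V, W1, b1, W2, WL):
    # Sum of the nonlinear (hidden) and linear (direct) connections.
    return sigmoid(V @ W1 + b1) @ W2 + V @ WL

# Toy data (hypothetical): a purely linear target of two inputs.
rng = np.random.default_rng(1)
V = rng.uniform(0.0, 1.0, size=(30, 2))
T = 2.0 * V[:, :1] - V[:, 1:2]

W1, b1, W2, WL = hmlp_elm_train(V, T, n_hidden=4, rng=rng)
Y = hmlp_predict(V, W1, b1, W2, WL)
```

In this toy case the target lies in the span of the input columns, so the augmented least-squares fit reproduces it up to numerical precision, which a network without the direct connections could only approximate.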

Methodology
This section discusses the proposed method to automate TB bacilli detection. It consists of four main steps: image acquisition, image segmentation, feature extraction and classification.
A. Image acquisition. A total of 25 ZN-stained tissue slides taken from TB patients were analysed. All the slides were provided by the Pathology Department, Hospital Universiti Sains Malaysia, Kelantan. Images of the tissue slides were acquired using a Lumenera Infinity 2 digital camera mounted on a Nikon Eclipse 80i light microscope. The slides were analysed under 40× magnification. The system captured 24-bit RGB images at a resolution of 800×600 pixels, which were saved in the bitmap (.bmp) file format. Fig. 2 shows examples of tissue slide images containing TB bacilli. The bacilli are rod-shaped and appear in red.
B. Image segmentation. The manual staining of tissue introduces variations in colour intensity. As a result, the intensity of the bacilli and the background tends to vary between images, as illustrated in Fig. 2. Fig. 2(a) and (b) show normally stained images, in which the bacilli appear in red and can easily be detected. In Fig. 2(c), however, the bacilli appear bright red with low contrast owing to understaining. Fig. 2(d) shows an overstained tissue image. Overstaining of the tissue slide causes parts of the background to remain as red as the bacilli. In brief, both understaining and overstaining have undesirable effects on the image intensity, which may complicate the detection process.
The current study used the procedure for segmenting the TB bacilli described in [10]. In brief, the approach starts by removing pixels which are not related to the red colour using a CY-based colour filter. Then k-means clustering, with the number of clusters $k = 2$, is used to segment the image. The saturation component of the CY colour model is chosen as the input feature. After the clustering process, the cluster which is not related to the TB bacilli is eliminated. In order to remove unwanted regions of various sizes, a 5×5 median filter, followed by region growing, is applied to the image. The median filter discards small artefacts and smooths the segmented regions, while region growing is employed to label the regions and determine their sizes. All regions smaller than 50 pixels or larger than 800 pixels are considered non-bacilli and are therefore eliminated from the image. Fig. 3 shows the results of applying the proposed segmentation procedure.
C. Feature extraction. It can be observed that colour image segmentation is unable to remove completely all the unwanted regions with a colour similar to that of the TB bacilli. Therefore, a representation of the bacilli in terms of their geometrical shape is required. In this work, a number of geometrical features have been extracted as follows:
• Size (A) and perimeter (P). The size refers to the number of pixels in a region, while the perimeter refers to the number of pixels in the boundary of a region. Both the size and the perimeter are obtained from the preceding region growing step.
• Minimum and maximum radius ($r_{min}$ and $r_{max}$). These terms refer to the minimum and maximum distance of a boundary pixel from the region's centroid $(x_c, y_c)$, as illustrated in Fig. 4.
• Shape factor (SF). It describes the degree of circularity of a shape.
• Eccentricity (e). Defined as the ratio of the length of the longest chord of the shape to that of the longest chord perpendicular to it. It can also be derived from the first and second Hu's moment invariants, $\phi_1$ and $\phi_2$ [14].
• Dispersion (I). This refers to the ratio of the major chord length to the area, computed from the coordinates x and y of the pixels located at the region's boundary.
• Zernike moments. Zernike moments are invariant to translation, rotation and scale. In this study, six Zernike moments (up to the third order) as derived in [15] are extracted as features.
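As an illustration of the simpler geometrical features listed above, the following pure-Python sketch computes the size, perimeter, centroid and minimum and maximum radius of one segmented region. It rests on several assumptions: a region is represented as a set of (x, y) pixel coordinates, a boundary pixel is taken to be any region pixel with a 4-connected neighbour outside the region, and the shape factor uses the common circularity form 4πA/P², which may differ from the exact definition used in the study. The function name and the test object are hypothetical.

```python
import math

def geometric_features(region):
    """region: set of (x, y) pixel coordinates of one segmented object."""
    area = len(region)  # size A: number of pixels in the region
    # Boundary pixels: region pixels with at least one 4-neighbour outside
    # the region (an assumed boundary definition).
    boundary = [
        (x, y) for (x, y) in region
        if any(n not in region
               for n in ((x + 1, y), (x - 1, y), (x, y + 1), (x, y - 1)))
    ]
    perimeter = len(boundary)  # perimeter P: number of boundary pixels
    xc = sum(x for x, _ in region) / area
    yc = sum(y for _, y in region) / area
    radii = [math.hypot(x - xc, y - yc) for x, y in boundary]
    return {
        "A": area,
        "P": perimeter,
        "centroid": (xc, yc),
        "r_min": min(radii),
        "r_max": max(radii),
        # Assumed circularity measure, not necessarily the paper's formula.
        "SF": 4.0 * math.pi * area / perimeter ** 2,
    }

# Hypothetical rod-like object: a block 9 pixels long and 3 pixels wide.
rod = {(x, y) for x in range(9) for y in range(3)}
features = geometric_features(rod)
```

For an elongated object such as this one, the gap between the minimum and maximum radius is large, which is the kind of cue that helps separate rod-shaped bacilli from rounder artefacts.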

Results and discussion
In order to demonstrate the robustness of the proposed method, 125 tissue slide images were selected to cover various staining conditions, namely properly stained, understained and overstained images. Then, a dataset containing 680 objects belonging to either 'TB' or 'possible TB' was collected from the images. Of these, 280 objects were identified as 'TB' and 400 objects as 'possible TB'.
For each object, the 13 geometrical features described previously served as the input variables to the HMLP-ELM network. All the input data were normalised to the range [0, 1] to prevent any feature from dominating the training process. The number of hidden nodes was varied from 1 to 50 to identify the optimum network structure, with 10 independent runs for each setting.
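The normalisation step can be illustrated with a simple per-feature min-max scaling. The text does not state which scaling formula was used, so the sketch below is an assumption; the function name and the sample values are hypothetical.

```python
def min_max_normalise(rows):
    """Scale every feature (column) of rows to the range [0, 1]."""
    columns = list(zip(*rows))
    lows = [min(c) for c in columns]
    highs = [max(c) for c in columns]
    return [
        [
            (value - lo) / (hi - lo) if hi > lo else 0.0  # constant column -> 0
            for value, lo, hi in zip(row, lows, highs)
        ]
        for row in rows
    ]

# Hypothetical feature rows: (size in pixels, shape factor).
raw = [[120.0, 0.30], [400.0, 0.90], [50.0, 0.60]]
scaled = min_max_normalise(raw)
```

Without such scaling, a feature measured in hundreds of pixels would swamp one that lies between 0 and 1 when the weighted sums are formed.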
The classification performance was further assessed by comparison with the HMLP-MRPE network, the HMLP-MRLS network and the original ELM. The design parameters for the HMLP-MRPE network were chosen as suggested in [12], while those for the HMLP-MRLS network were taken from [13]. The sigmoidal activation function was used in the hidden layer of all networks. For the HMLP-MRPE and HMLP-MRLS networks, the number of hidden nodes was likewise varied from 1 to 50, with 10 independent runs for each setting. The SLFNN trained by the original ELM followed the same training procedure as the HMLP-ELM. The simulations for all networks were conducted in Matlab R2008b on a laptop with an Intel Core2Duo 2.4 GHz CPU and 4 GB of RAM.
In order to perform a fair comparison, 75% and 25% of the samples were chosen randomly for training and testing, respectively. Ten runs, each with different training and testing datasets, were conducted and the results were averaged. Table I tabulates the classification performance of the HMLP-ELM, HMLP-MRPE, HMLP-MRLS and SLFNN-ELM networks. The comparison is based on the average classification accuracy, average training time and optimum structure of each network.
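The evaluation protocol (repeated random 75%/25% splits with averaged results) can be sketched as follows. The function names are hypothetical, and a trivial majority-class baseline stands in for the actual networks, so the resulting number says nothing about the reported accuracies. With the 680 objects described above, each split gives 510 training and 170 testing objects.

```python
import random

def repeated_holdout(samples, labels, train_and_score,
                     runs=10, train_fraction=0.75, seed=0):
    """Average the test score over several random train/test splits."""
    rng = random.Random(seed)
    indices = list(range(len(samples)))
    n_train = int(round(train_fraction * len(samples)))
    scores = []
    for _ in range(runs):
        rng.shuffle(indices)  # a fresh random split for every run
        train_idx, test_idx = indices[:n_train], indices[n_train:]
        scores.append(train_and_score(
            [samples[i] for i in train_idx], [labels[i] for i in train_idx],
            [samples[i] for i in test_idx], [labels[i] for i in test_idx],
        ))
    return sum(scores) / len(scores)

def majority_baseline(x_train, y_train, x_test, y_test):
    # Placeholder classifier: always predicts the most frequent training label.
    majority = max(set(y_train), key=y_train.count)
    return sum(y == majority for y in y_test) / len(y_test)

# Class sizes follow the dataset described in the text; the sample values
# themselves are dummies because the baseline ignores the features.
labels = ["TB"] * 280 + ["possible TB"] * 400
samples = list(range(len(labels)))
mean_accuracy = repeated_holdout(samples, labels, majority_baseline)
```

Any classifier can be plugged in through the `train_and_score` callback, which keeps the splitting logic identical across the networks being compared.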
Throughout the analysis, the HMLP-MRPE network achieved the highest training accuracy, followed by the HMLP-MRLS, HMLP-ELM and SLFNN-ELM networks. However, the proposed HMLP-ELM network is slightly better in terms of generalization performance than the standard HMLP-MRPE network and the SLFNN-ELM. It achieved a testing accuracy of 96.47%, which is slightly higher than that of both methods. Overall, the HMLP-MRLS network outperformed all the other networks in terms of testing classification accuracy and network structure, with a testing accuracy of 97.06% and 4 hidden nodes.
Both the HMLP-ELM and SLFNN-ELM networks required more hidden nodes than the HMLP-MRLS and HMLP-MRPE networks. This is because the ELM analytically estimates only the hidden to output layer weights, while the input to hidden layer weights and the hidden layer biases are chosen randomly. However, the HMLP-ELM network required less training time than both the HMLP-MRPE and HMLP-MRLS networks. It is approximately 13 times faster than the HMLP-MRPE network and 22 times faster than the HMLP-MRLS network. The SLFNN-ELM required less training time still, as only the hidden to output weights were estimated during training, compared to both the hidden to output weights and the linear connection weights that had to be estimated in the HMLP-MRPE network.
The proposed HMLP-ELM offers several advantages over the standard HMLP-MRPE and HMLP-MRLS networks. Its learning speed is high and, without any iteration, it is able to achieve a comparable generalization performance. It also reaches the solution directly, without the problems of local minima and improper learning rates usually faced by conventional learning algorithms [11]. In addition, the HMLP-ELM training algorithm is much simpler and requires no design parameters, unlike the MRPE and MRLS training algorithms.
Fig. 6 shows the detection result using the HMLP-ELM network. The regions marked with blue circles represent objects which are classified as 'TB', while the red circles represent 'possible TB'. The CoD value assigns a degree of confidence to the detection result. If the CoD value is lower than a certain threshold, for example 50%, the tissue slide can be re-examined manually by microbiologists to confirm the presence of TB bacilli.

Conclusions
A method for detecting and classifying TB bacilli in ZN-stained tissue slides using image processing and the HMLP-ELM network has been proposed. A total of 13 geometrical features have been extracted to represent the segmented regions. The HMLP-ELM network was then used to classify these regions into 'TB' and 'possible TB'.
The proposed HMLP-ELM network has produced acceptable results with a shorter training time than the HMLP network trained with the MRPE and MRLS algorithms. The training algorithm is also simpler and easier to implement, and requires no design parameters.
The results presented in this work are based on the classification performance for 'TB' and 'possible TB'. Since 'possible TB' may consist of either overlapped TB bacilli or outlier particles, the accuracy, sensitivity and specificity of the diagnosis cannot yet be determined. Therefore, future work will focus on identifying more suitable features to classify 'possible TB' into overlapped TB and non-TB, so that the diagnostic performance can be compared with manual diagnosis and the reliability of the detection improved.

Fig. 1. Schematic diagram of an HMLP network
B. Extreme Learning Machine. The Extreme Learning Machine (ELM) was proposed by Huang et al. [11] to train a single layer feedforward neural network (SLFNN). The method was reportedly able to overcome the problem of slow training of the gradient-based learning algorithms in the SLFNN.

Fig. 2. Example of tissue images with TB bacilli

Fig. 3. Result of segmentation using the proposed procedure: (a) original image, and the result of applying (b) the CY-based colour filter, (c) k-means clustering, (d) the median filter and (e) region growing

Fig. 4. The $r_{min}$, $r_{max}$ and centroid
D. Detection and classification. All the extracted features are fed to the HMLP-ELM network for detection and classification. The current study introduces two classes: 'TB' refers to single bacilli, while overlapped bacilli and red stains, which appear in various shapes, are treated as 'possible TB'. Based on the number of 'TB' and 'possible TB' objects, the present study defines the Confidence of Detection, CoD (in percentage), for an image, as formulated in (15).