Classification of Abdominal CT Images bearing Liver Tumor Using Structural Similarity Index and Support Vector Machine

Computed tomographic (CT) imaging is extensively used for liver tumor visualization and detection. Computer-aided image processing algorithms can assist physicians and radiologists in detecting deadly diseases of the liver, specifically cancerous liver tumors. This paper presents a novel image processing technique that automatically classifies the liver for abnormality without going through the liver segmentation stage. The study is conducted on a dataset of 39 abdominal CT images. The CT dataset comprises images bearing unhealthy liver. The unhealthy livers are further divided into livers bearing malignant tumors, namely hepatoma, and benign tumors, namely hemangioma. The methodology comprises feature extraction from the original CT images of all types of liver, with special focus on textural information. The extracted features are then classified for malignancy or benignancy of the liver tumor. The classifiers used for textural feature analysis include Support Vector Machine (SVM), K-Nearest Neighbor (KNN) and an ensemble classifier. Among these classifiers, SVM yields a classification accuracy of 100%, compared with KNN and the ensemble classifier, which give classification accuracies of 94.7% and 52.6% respectively. The proposed classification is applied to entire abdominal CT scans without segmentation, using features extracted with the structural similarity index (SSIM), which gives an improved classification accuracy of 100% compared with the traditional gray level co-occurrence matrix (GLCM). The methodology can also be tested for classifying liver malignancy with other non-invasive techniques such as ultrasound and Magnetic Resonance Imaging (MRI).


INTRODUCTION
Medical image analysis of different organs in the human body, including the liver, brain, lungs and heart, has been an area of research interest for years [1,2]. The liver is responsible for the purification of blood, due to which it is exposed to impurities more than any other organ [3,4]. When the liver is affected by an ailment such as a tumor, it is essential to have it diagnosed for malignancy at the earliest. For this purpose, the non-invasive method of CT imaging is most commonly used by practitioners for disease identification [5]. Due to the limited exposure of [...] the classification of manually mined tumor patches using neural networks. The classification accuracy turned out to be 87% for differentiating malignant and benign cases. Later, Lee et al. [15] again manually extracted patches of normal and abnormal liver for classification into cyst, hemangioma, hepatoma and healthy liver. Region-based shape descriptors and GLCM were employed for feature extraction. The average area under the receiver operating characteristic curve (AUC) turned out to be 0.91 using SVM for classification. K. Mala et al. [16] segmented the liver from the CT images and proposed a probabilistic neural network for classifying two liver diseases based on statistical feature analysis using wavelets. The classification accuracy turned out to be 95%. Alahmer and Ahmed [17] segmented the liver tumor for classification based on region-based and edge-based texture analysis. Multiple regions of interest (ROI) were chosen to classify malignant and benign liver tumors using SVM, which yielded a classification accuracy of 98%.
Various studies on liver tumor classification by semiautomatic extraction of the focal liver region exist [18][19][20]; however, little work has been reported on a design that automatically classifies the entire abdominal CT scan for malignancy. In this respect, the latest works include that of Özyurt et al. [21], who proposed a novel method of classifying whole abdominal CT images into malignant and benign liver by adopting a Perceptual-Hash based Convolutional Neural Network (PH-CNN). The methodology gave a classification accuracy of 98.2%. Doğantekin et al. [22] also classified entire CT images for malignancy of liver tumor, using PH-CNN for feature extraction. The classification accuracy turned out to be 97.3% using an extreme learning machine classifier. It can be observed that classifying the whole CT image can provide a more efficient approach by removing the time spent on liver or tumor segmentation.
The remainder of the paper comprises three sections. (2) The materials and methods section describes the materials used during the research and the method adopted for algorithm design. (3) The results and discussion section gives the graphical and pictorial representation of the experimental results, followed by a discussion of the performance of the proposed technique compared with state-of-the-art methods. (4) Finally, the paper closes with the conclusion section, followed by acknowledgements and references.

Materials
This research is conducted on a clinical dataset of CT images obtained from the laboratory and from the open data source Radiopaedia [23]. The dataset comprises 39 CT images, among which 20 images contain the malignant liver tumor hepatoma, 18 images contain the benign liver tumor hemangioma, and a single reference sample shows a normal liver. The images acquired from the CT scanner are examined by an expert radiologist for disease identification using RadiAnt DICOM Viewer. The DICOM images, viewed in the portal venous phase, are imported into MATLAB 2017 for the algorithm design. The archived CT images are in RGB format, each having a size of 512×512×3 pixels with a slice thickness of 1-1.5 mm. The study is conducted on an Intel Core i3 2.20 GHz processor with 4 GB RAM. The sections of the methodology are elaborated below.

Image acquisition and grayscale conversion:
The methodology begins with data acquisition from the CT scanner. The original CT images of normal and infected liver are converted to grayscale images with an intensity range of 0-255, as the textural information lies in the spatial distribution of varying shades of grayscale intensities [24]. After grayscale conversion, the images undergo feature extraction followed by classification, as illustrated in Fig. 1. The textural features are extracted from the images depending upon the statistical distribution of intensities in the ROI. Textural and tonal characteristics of homogeneity, contrast, correlation, energy, luminance and structure are employed to form two feature descriptors, namely the Gray Level Co-occurrence Matrix (GLCM) and the structural similarity index (SSIM). The GLCM features are based on the textural characteristics of homogeneity, contrast and correlation [24], as given in Table 1. GLCM portrays the texture of an image by counting how frequently pairs of pixels with particular values occur in a definite spatial relation. SSIM uses information about image luminance, contrast and structure for image classification. Table 1 also gives the expressions for the SSIM features [25], where p(i,j) gives the two-dimensional pixel values of the image, σ denotes the standard deviation, µx and µy denote the local means of intensities of the two images x and y whose SSIM values are to be compared, and σxy represents the cross covariance of images x and y. The features for GLCM and SSIM are extracted from the original abdominal CT images, as given in Fig. 2, so that the methodology can be implemented in clinical practice as well [26].
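As an illustration of the GLCM step described above, the following is a minimal sketch in plain Python, not the authors' MATLAB implementation. The function names (`to_gray`, `glcm`, `glcm_features`), the luma weights used for grayscale conversion, the horizontal one-pixel offset and the feature formulas are assumptions based on commonly used definitions, since Table 1 is not reproduced here. The 4×4 image is a toy example, not CT data.

```python
import math

def to_gray(rgb):
    """Luma-weighted gray value in 0-255 for one (R, G, B) pixel (assumed weights)."""
    r, g, b = rgb
    return int(round(0.299 * r + 0.587 * g + 0.114 * b))

def glcm(gray, levels, dx=1, dy=0):
    """Normalized co-occurrence matrix for pixel pairs separated by (dx, dy)."""
    rows, cols = len(gray), len(gray[0])
    counts = [[0] * levels for _ in range(levels)]
    total = 0
    for r in range(rows):
        for c in range(cols):
            r2, c2 = r + dy, c + dx
            if 0 <= r2 < rows and 0 <= c2 < cols:
                counts[gray[r][c]][gray[r2][c2]] += 1
                total += 1
    return [[n / total for n in row] for row in counts]

def glcm_features(p):
    """Contrast, correlation, energy and homogeneity of a normalized GLCM."""
    idx = range(len(p))
    contrast = sum(p[i][j] * (i - j) ** 2 for i in idx for j in idx)
    energy = sum(p[i][j] ** 2 for i in idx for j in idx)
    homogeneity = sum(p[i][j] / (1 + abs(i - j)) for i in idx for j in idx)
    mu_i = sum(i * p[i][j] for i in idx for j in idx)
    mu_j = sum(j * p[i][j] for i in idx for j in idx)
    var_i = sum((i - mu_i) ** 2 * p[i][j] for i in idx for j in idx)
    var_j = sum((j - mu_j) ** 2 * p[i][j] for i in idx for j in idx)
    cov = sum((i - mu_i) * (j - mu_j) * p[i][j] for i in idx for j in idx)
    # Correlation is undefined for a constant image, so report NaN in that case.
    correlation = cov / math.sqrt(var_i * var_j) if var_i * var_j > 0 else float("nan")
    return {"contrast": contrast, "correlation": correlation,
            "energy": energy, "homogeneity": homogeneity}

# Toy 4x4 image with 4 gray levels (illustrative values only).
toy = [[0, 0, 1, 1],
       [0, 0, 1, 1],
       [0, 2, 2, 2],
       [2, 2, 3, 3]]
features = glcm_features(glcm(toy, levels=4))
print(features)
```

For a 512×512 grayscale image, `levels` would be 256 and the same functions apply unchanged; a numerical library would normally be used for speed.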
The abdominal CT images bearing benign and malignant liver tumors undergo feature extraction by applying the GLCM feature descriptor directly to the original images of cancerous and noncancerous tumors. For the structural similarity index, the value is computed for each image against a reference image. The healthy liver image, verified by the radiologist, is chosen as the reference image. All other input images are compared with the reference image to compute SSIM using Equation (1) [25].
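The comparison against a reference image can be sketched as follows. This is a simplified, assumption-laden illustration rather than the authors' code: it evaluates the widely used luminance, contrast and structure terms of SSIM once over the whole image, whereas SSIM is usually computed over local windows and averaged. The function name `ssim_global`, the stabilizing constants (0.01 and 0.03 times the dynamic range, squared) and the toy pixel values are all assumptions.

```python
def ssim_global(x, y, dynamic_range=255.0):
    """Single-window SSIM of two equal-size grayscale images (lists of rows).

    Returns (luminance, contrast, structure, ssim).
    """
    xs = [v for row in x for v in row]
    ys = [v for row in y for v in row]
    if len(xs) != len(ys) or len(xs) < 2:
        raise ValueError("images must have the same size and at least 2 pixels")
    n = len(xs)
    mu_x, mu_y = sum(xs) / n, sum(ys) / n
    var_x = sum((v - mu_x) ** 2 for v in xs) / (n - 1)
    var_y = sum((v - mu_y) ** 2 for v in ys) / (n - 1)
    cov_xy = sum((a - mu_x) * (b - mu_y) for a, b in zip(xs, ys)) / (n - 1)
    sd_x, sd_y = var_x ** 0.5, var_y ** 0.5
    # Small constants that keep the ratios stable (commonly used defaults).
    c1 = (0.01 * dynamic_range) ** 2
    c2 = (0.03 * dynamic_range) ** 2
    c3 = c2 / 2
    luminance = (2 * mu_x * mu_y + c1) / (mu_x ** 2 + mu_y ** 2 + c1)
    contrast = (2 * sd_x * sd_y + c2) / (var_x + var_y + c2)
    structure = (cov_xy + c3) / (sd_x * sd_y + c3)
    return luminance, contrast, structure, luminance * contrast * structure

# Toy 2x2 "reference" and "input" images (illustrative values only).
reference = [[10, 20], [30, 40]]
tumor_like = [[12, 25], [28, 60]]
print(ssim_global(reference, tumor_like))
```

Under this sketch, an image identical to the reference scores 1, and the score drops as luminance, contrast or structure diverge from the reference.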

Classification:
To the best of our knowledge, the classification strategy adopted here has not been reported before. The novel steps adopted in this paper are:
• Using the whole original CT image for liver tumor classification
• Extracting SSIM features from the original CT image
• Using a healthy liver as the reference image for SSIM computation
The SSIM and GLCM parameters are classified using three classifiers. In the training dataset, the textural features of benign tumors are labeled 1 and those of malignant tumors are labeled 2. These labeled data are fed to the three classifiers, SVM, KNN and ensemble, individually. The classification accuracy using GLCM features is compared with that using SSIM features in order to select the classifier and the feature descriptor that give the maximum classification accuracy. The entire algorithm design is illustrated in Fig. 3.
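To make the labeling scheme concrete, the sketch below feeds labeled feature vectors to a small k-nearest-neighbor classifier, one of the three classifier types named above. It is an illustrative stand-in only: the function `knn_predict`, the choice of k = 3, the Euclidean distance and the two-dimensional feature values are assumptions, and the numbers are toy values rather than features from the study.

```python
import math
from collections import Counter

BENIGN, MALIGNANT = 1, 2  # label values as described in the text

def knn_predict(train_x, train_y, query, k=3):
    """Majority label among the k training vectors closest to `query`."""
    ranked = sorted(
        (math.dist(features, query), label)
        for features, label in zip(train_x, train_y)
    )
    votes = Counter(label for _, label in ranked[:k])
    return votes.most_common(1)[0][0]

# Toy feature vectors (hypothetical values, not measurements from the dataset).
train_x = [(0.95, 0.90), (0.92, 0.88), (0.90, 0.91),
           (0.60, 0.55), (0.58, 0.62), (0.65, 0.50)]
train_y = [BENIGN, BENIGN, BENIGN, MALIGNANT, MALIGNANT, MALIGNANT]

print(knn_predict(train_x, train_y, (0.93, 0.89)))  # near the benign group
print(knn_predict(train_x, train_y, (0.61, 0.57)))  # near the malignant group
```

An SVM or ensemble classifier would take the same labeled inputs; only the decision rule differs.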
The classification model is validated using the K-fold cross validation method, where the parameter K gives the number of folds [17]. In this method the complete set of data is partitioned into K separate folds. The value of K selected for the study is 5. The algorithm begins by partitioning the input data into 5 sets or folds. For each fold, the model is trained on the out-of-fold data and its performance is assessed on the in-fold data. After going through all the folds, the average test error over all folds is calculated. Because every sample is used for testing exactly once, the procedure gives a satisfactory estimate of the prediction accuracy of the classification model. The steps followed for each fold are shown in Fig. 4.
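The fold procedure can be sketched in a few lines of plain Python. This is a hedged illustration of generic 5-fold cross validation, not the authors' MATLAB routine: the helper names (`k_fold_indices`, `cross_validate`, `nearest_mean`), the shuffling seed and the one-dimensional toy features are assumptions, and the nearest-mean rule merely stands in for whichever classifier is being validated.

```python
import random

def k_fold_indices(n_samples, k=5, seed=0):
    """Shuffle sample indices and deal them into k nearly equal folds."""
    idx = list(range(n_samples))
    random.Random(seed).shuffle(idx)
    return [idx[i::k] for i in range(k)]

def cross_validate(features, labels, fit_predict, k=5, seed=0):
    """Average in-fold error of a model trained on the out-of-fold data."""
    errors = []
    for in_fold in k_fold_indices(len(features), k, seed):
        held = set(in_fold)
        out_fold = [i for i in range(len(features)) if i not in held]
        predictions = fit_predict(
            [features[i] for i in out_fold],
            [labels[i] for i in out_fold],
            [features[i] for i in in_fold],
        )
        wrong = sum(p != labels[i] for p, i in zip(predictions, in_fold))
        errors.append(wrong / len(in_fold))
    return sum(errors) / len(errors)

def nearest_mean(train_x, train_y, test_x):
    """Stand-in classifier: assign the label whose training mean is closest."""
    means = {}
    for label in set(train_y):
        values = [x for x, y in zip(train_x, train_y) if y == label]
        means[label] = sum(values) / len(values)
    return [min(means, key=lambda lab: abs(x - means[lab])) for x in test_x]

# Toy data sized like the study: 18 benign (label 1) and 20 malignant (label 2).
features = [0.9 + 0.001 * i for i in range(18)] + [0.5 + 0.001 * i for i in range(20)]
labels = [1] * 18 + [2] * 20
print(cross_validate(features, labels, nearest_mean, k=5))
```

With 38 samples and K = 5, this partition gives three folds of 8 samples and two folds of 7.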

RESULTS AND DISCUSSION:
The experimentation is conducted on 20 samples of hepatoma (malignant tumor) and 18 samples of hemangioma (benign tumor), as shown in Fig. 5 and Fig. 6 respectively. The complete set of data, i.e. 38 samples, is given to the classifier for training and tested under the 5-fold cross validation method illustrated in Fig. 4. The Receiver Operating Characteristic (ROC) is employed as the quality metric to evaluate the performance of the classifier. The ROC curve is plotted for threshold values of the output lying in the interval between 0 and 1. The true positive rate and false positive rate of a particular output class are calculated at each threshold. The classifier whose curve lies closest to the top-left corner of the ROC plot shows better operating characteristics. The ROC curve therefore helps in illustrating the specificity of the output class for the classifier under observation. The comparison of ROC for GLCM and SSIM features using SVM is shown in Fig. 7 and Fig. 8 respectively. Fig. 8 shows that classification using the SVM classifier gives better results for the SSIM feature descriptor, with an Area Under the Curve (AUC) equal to 1 and 100% specificity for hepatoma. In Fig. 7, the specificity for hepatoma samples using the SVM classifier with GLCM features is 83% with AUC = 0.96. The extracted features, as given in Table 1, are also tested for classification using the KNN and ensemble classifiers. Table 2 shows the performance of all three classifiers for malignant and benign liver tumor classification using the GLCM and SSIM feature descriptors. The methodology adopted in the study gives excellent classification results when the features are extracted through SSIM and compared with normal liver tissue.
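The ROC construction described above can be illustrated with a short plain-Python sketch. It assumes the classifier outputs a score between 0 and 1 for the positive class (here label 2, malignant), sweeps the threshold over the observed scores, and integrates the curve with the trapezoidal rule. The function names `roc_points` and `auc`, and the scores themselves, are hypothetical and are not results from the study.

```python
def roc_points(scores, labels, positive):
    """(false positive rate, true positive rate) pairs for decreasing thresholds."""
    n_pos = sum(1 for lab in labels if lab == positive)
    n_neg = len(labels) - n_pos
    ranked = sorted(zip(scores, labels), key=lambda pair: -pair[0])
    points = [(0.0, 0.0)]
    tp = fp = 0
    i = 0
    while i < len(ranked):
        threshold = ranked[i][0]
        # Samples sharing the same score cross the threshold together.
        while i < len(ranked) and ranked[i][0] == threshold:
            if ranked[i][1] == positive:
                tp += 1
            else:
                fp += 1
            i += 1
        points.append((fp / n_neg, tp / n_pos))
    return points

def auc(points):
    """Area under the ROC curve by the trapezoidal rule."""
    return sum((x1 - x0) * (y0 + y1) / 2
               for (x0, y0), (x1, y1) in zip(points, points[1:]))

# Hypothetical classifier scores for the malignant class (label 2).
separable = roc_points([0.9, 0.8, 0.7, 0.4, 0.3, 0.2], [2, 2, 2, 1, 1, 1], positive=2)
overlapping = roc_points([0.9, 0.8, 0.7, 0.6], [2, 1, 2, 1], positive=2)
print(auc(separable), auc(overlapping))
```

A perfectly separated set of scores hugs the top-left corner and gives an AUC of 1, while overlapping scores pull the curve toward the diagonal and lower the AUC.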
Since changes occur in a healthy liver as it progresses towards a serious disease such as a tumor, we have extracted image textural properties from tumorous livers and compared them with a normal liver to form the SSIM features. The results show that SSIM gives improved results compared with the GLCM features. The innovation of the study lies in classifying the entire CT image using an image processing algorithm without segmentation of the liver, an evolving trend that departs from the traditional, tedious approach of liver or tumor segmentation before classification. The methodology is compared with existing studies in Table 3, which shows that our method gives promising results compared with the others.

CONCLUSION
The paper presents an efficient method for liver tumor classification using the SSIM feature descriptor. The features of contrast, luminance and structure are taken into consideration. The study is conducted on 38 samples of liver bearing tumor, verified by the radiologist. The classification of benign (hemangioma) and malignant (hepatoma) tumors yields a classification accuracy of 100%. The methodology can be extended to analyze a larger number of samples bearing different types of tumors, verified against histopathological records for tumor malignancy. A further extension of our work lies in the multidimensional study of liver tumors, as has been conducted on brain tumors in the past [31,32], so that an integrated imaging system can be designed for efficient computer-aided tumor diagnosis and surgery planning.

ACKNOWLEDGEMENT
The authors would like to thank Dawood University of Engineering and Technology and NED University of Engineering and Technology for providing research facilities. The authors would also like to thank Dr. Syed Izhar Zaidi for his unending support regarding data collection during the research.