Convolution Neural Networks and Support Vector Machines for Automatic Segmentation of Intracoronary Optical Coherence Tomography

Cardiovascular diseases are closely associated with deteriorating atherosclerotic plaques. Optical coherence tomography (OCT) is a recently developed intravascular imaging technique with a high resolution of approximately 10 microns that can provide accurate quantification of coronary plaque morphology. However, tissue segmentation of OCT images in the clinic is still mainly performed manually by physicians, which is time consuming and subjective. To overcome these limitations, two automatic segmentation methods for intracoronary OCT images, based on support vector machine (SVM) and convolutional neural network (CNN), were implemented to identify the plaque region and characterize plaque components. In vivo IVUS and OCT coronary plaque data from 5 patients were acquired at Emory University with patients' consent obtained. Seventy-seven matched IVUS and OCT slices with good image quality and lipid cores were selected for this study. Manual OCT segmentation was performed by experts using virtual histology IVUS as guidance, and was used as the gold standard for the automatic segmentations. The overall classification accuracy of the CNN method was 95.8%, and that of the SVM method was 71.9%. The CNN-based segmentation method can better characterize plaque compositions on OCT images and greatly reduce the time spent by doctors in segmenting and identifying plaques.


Introduction
Cardiovascular diseases are closely associated with atherosclerotic plaque development and rupture. Accurate and detailed identification of atherosclerotic plaques advances the understanding and diagnosis of cardiovascular diseases. However, current common imaging modalities such as magnetic resonance imaging (MRI) and intravascular ultrasound (IVUS) are unable to identify vulnerable coronary plaques due to their limited imaging resolution [1][2]. Optical coherence tomography (OCT) is a recently emerged intravascular imaging technique with a high resolution of approximately 10 microns that can provide accurate morphology for coronary plaques [3][4]. Moreover, it has the unique ability to identify plaques with fibrous cap thickness < 65 µm, an accepted threshold value for vulnerable plaques [5]. However, segmentation of OCT images in the clinic is still mainly performed manually by physicians, which is time consuming and can lead to inter-observer and intra-observer variability [6]. As such, automatic segmentation and recognition of vulnerable plaques through quantification of plaque components have great clinical significance for cardiovascular research.
To overcome the time-consuming manual segmentation process, several methodologies have been proposed for automatic segmentation of OCT images. Van Soest et al. [7] presented a framework for automatic classification of atherosclerotic plaque constituents based on the optical attenuation coefficient of the tissue in IVOCT images. Athanasiou et al. [8] presented a fully automated methodology to detect the lumen border, identify the plaque region and detect four tissue types based on random forests (RF). Shalev et al. [9] and Guo et al. [10] applied support vector machines (SVM) to characterize plaques. Convolutional neural networks (CNN) have become a popular method for image analysis and have recently been applied to OCT images. More recently, Abdolmanafi et al. [11] and Athanasiou et al. [12] investigated deep learning models based on CNN for tissue characterization. These studies have demonstrated that both machine learning and deep learning can be applied to plaque segmentation with high accuracy. However, in all of these methods the annotations were based on experts' experience and not on any form of a priori knowledge such as histology; histology/virtual-histology still remains the gold standard in intracoronary plaque segmentation. In addition, most of these methods were still limited by intricate image preprocessing and expensive computation.
In this paper, two intracoronary OCT segmentation methods, based on SVM and CNN, were implemented to identify the plaque region and characterize different plaque components. The gold standard for training and testing the methods was based on a hybrid plaque estimation approach: experts' knowledge and virtual histology were combined in an attempt to present a more robust plaque segmentation method. The results of the two segmentation algorithms were then compared on the same dataset.

Data Acquisition and Hybrid Gold Standard Formation
OCT and IVUS image data from 5 patients were acquired at Emory University with informed consent obtained. OCT images were obtained with the ILUMIEN OPTIS System (St. Jude, Minnesota, MN). The OCT catheter was advanced to the segment of interest and the catheter pullback rate was limited to 20 mm/sec. Following OCT image acquisition, the IVUS catheter (Volcano Therapeutics, Rancho Cordova, CA) was advanced distally through the artery to the same coronary segment and pulled back at a standard rate of 0.5 mm/sec. Four different colors were used to distinguish four plaque types in colored VH-IVUS (virtual histology IVUS) images: fibrous, fibro-fatty, lipid and dense calcified tissue (calcification, Ca for short) (see Fig. 1(b)). Dynamic angiography data were recorded, which provided the positions of both catheters for co-registration of OCT and IVUS images. Seventy-seven matched IVUS/VH-IVUS and OCT slices with good image quality and lipid cores were identified for training and validation of the segmentation methods. Manual OCT segmentation was performed by experts using VH-IVUS images as references and was used as the gold standard for the automatic segmentations [13]. Manual annotations were performed to identify lipid tissue (LT), fibrous tissue (FT) and background (BG). Fig. 1 shows 5 sample slices with VH-IVUS image slices, VH-IVUS contour plots, OCT image slices, OCT contour plots (expert-drawn), and registered OCT and IVUS slices with OCT and VH-IVUS overlapped. Tiny calcifications were neglected in this study.

Machine Learning Segmentation Method
Two segmentation methods for intracoronary OCT images, based on convolutional neural network (CNN) and support vector machine (SVM), were implemented to detect the lumen borders and characterize the plaque components.

Convolutional Neural Network (CNN) Method

Image processing and data augmentation
Due to the limited penetration depth of OCT, the network input images were cropped from 1024 by 1024 pixels (the original OCT data matrix) to 592 by 544 pixels (the cropped data matrix indicated by the marked rectangle) to remove invalid information (see Fig. 2) and reduce model training time. Furthermore, since the power of deep learning models relies heavily on the size of the training data, we enlarged the existing dataset via data augmentation to train a more robust deep convolutional neural network. In our study, data augmentation consisted of rotation (rotation angle θ = 90, 180, 270, 360, 450, 540 degrees) and brightness transformation (parameter values 0.5, 0.7, 0.9, 1.1).
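The cropping and augmentation steps can be illustrated with a minimal sketch in plain Python. It is not the authors' implementation. The crop position, the counter-clockwise rotation direction, and the reading of the brightness parameter as a multiplicative factor clipped to an 8-bit range are all assumptions, and the function names are hypothetical. The rotation angles and brightness values are those listed above; note that 360, 450 and 540 degrees coincide with 0, 90 and 180 degrees for a right-angle rotation.

```python
# Angles and brightness parameters as listed in the text.
ROTATION_ANGLES = (90, 180, 270, 360, 450, 540)
BRIGHTNESS_FACTORS = (0.5, 0.7, 0.9, 1.1)


def crop(image, top, left, height, width):
    """Return the height x width sub-image whose top-left corner is (top, left)."""
    return [list(row[left:left + width]) for row in image[top:top + height]]


def rotate(image, angle):
    """Rotate counter-clockwise by a multiple of 90 degrees (assumed direction)."""
    if angle % 90 != 0:
        raise ValueError("only multiples of 90 degrees are supported")
    out = [list(row) for row in image]
    for _ in range((angle // 90) % 4):
        # Transposing and then reversing the row order gives a 90-degree
        # counter-clockwise rotation.
        out = [list(row) for row in zip(*out)][::-1]
    return out


def scale_brightness(image, factor, max_value=255):
    """Multiply every pixel by factor and clip to max_value (assumed 8-bit data)."""
    return [[min(max_value, round(p * factor)) for p in row] for row in image]


def augment(image, mask):
    """Return (image, mask) pairs: one per rotation angle, one per brightness factor.

    Rotations are applied to image and mask together so labels stay aligned;
    brightness changes leave the mask untouched.
    """
    pairs = [(rotate(image, a), rotate(mask, a)) for a in ROTATION_ANGLES]
    pairs += [(scale_brightness(image, f), [list(r) for r in mask])
              for f in BRIGHTNESS_FACTORS]
    return pairs
```

In this sketch each image yields ten augmented pairs. Whether rotations and brightness changes were applied separately or in combination is not stated in the text, so the count is illustrative only.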

Model architecture
A convolutional neural network extracts image features at multiple levels of abstraction, from low-level detail in its early layers to higher-level context in its deeper layers, which is appealing for segmentation in OCT images. The neural network model chosen for this problem was based on the U-Net architecture, which has shown promise in the field of segmentation, particularly for medical images [14]. The network takes a full image section as input and, through a series of trainable weights, produces the corresponding segmentation mask. A focal loss between the true segmentation and the model output was used in the U-Net model because of class imbalance: lipid tissue makes up a smaller proportion of pixels than the other classes [15]. The original 77 OCT images were divided into eleven test sets and eleven-fold cross validation was employed, such that when a test set was selected, the remaining images and their augmented versions served as the training set. Model selection was made by using the 1-standard-error rule on the validation data set [16]. Training and testing were repeated five times. All U-Net models were implemented in PyTorch, version 3.6.
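To illustrate why a focal loss helps with class imbalance, the following plain-Python sketch evaluates the commonly used form FL = -α_t (1 - p_t)^γ log(p_t), where p_t is the predicted probability of the true class. It is not the authors' PyTorch code. The values of γ and α are not given in the text, so the default γ = 2 and the optional per-class weights are assumptions, and the function name is hypothetical.

```python
import math


def focal_loss(probs, targets, gamma=2.0, alpha=None, eps=1e-7):
    """Mean multi-class focal loss over a set of pixels.

    probs   : list of per-pixel class-probability lists (e.g. softmax output)
    targets : list of true class indices, one per pixel
    gamma   : focusing parameter (assumed value; gamma = 0 gives cross entropy)
    alpha   : optional list of per-class weights (assumed, not from the text)
    """
    total = 0.0
    for p, t in zip(probs, targets):
        # Probability assigned to the true class, clamped for numerical safety.
        p_t = min(max(p[t], eps), 1.0 - eps)
        weight = 1.0 if alpha is None else alpha[t]
        # The (1 - p_t)^gamma factor shrinks the loss of well-classified pixels,
        # so abundant easy pixels contribute less than rare hard ones.
        total += -weight * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(targets)
```

With gamma set to 0 the expression reduces to ordinary cross entropy, which makes the down-weighting of confidently classified pixels easy to compare.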

Support Vector Machine (SVM) Method

Image processing and ROI definition
For the same reason given in the CNN method section (limited penetration depth of OCT), it is not sensible to process the whole OCT image slice. To define the region of interest (ROI), the lumen border (the inner boundary of the ROI) was expressed in a polar coordinate system and shifted 1 mm outward, away from the center of the catheter, to generate an outline called the out-boundary of the ROI. The ROI is the area bounded by the lumen and the out-boundary, with artery branches and artifacts removed (see Fig. 3, ROI-yellow).
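The ROI construction can be sketched in polar coordinates as follows. This is a minimal illustration under stated assumptions, not the authors' code: the lumen border is assumed to be available as one radial index per angular line, the pixel spacing `mm_per_pixel` is a hypothetical parameter, and the removal of branches and artifacts is not modeled.

```python
def roi_mask_polar(lumen_radius_px, n_radial, mm_per_pixel, depth_mm=1.0):
    """Build a polar ROI mask (rows = angular lines, columns = radial samples).

    lumen_radius_px : lumen border position (radial index) on each angular line
    n_radial        : number of radial samples per angular line
    mm_per_pixel    : radial pixel spacing in mm (hypothetical parameter)
    depth_mm        : ROI depth measured outward from the lumen border
    """
    depth_px = round(depth_mm / mm_per_pixel)
    mask = []
    for lumen in lumen_radius_px:
        # The out-boundary is the lumen border shifted outward by depth_px,
        # limited to the available radial samples.
        outer = min(n_radial, lumen + depth_px)
        mask.append([lumen <= r < outer for r in range(n_radial)])
    return mask
```

For example, with a spacing of 0.25 mm per pixel the ROI spans four radial samples beyond the lumen border on each angular line, fewer where the image edge is reached first.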

Feature extraction and classification
Local binary patterns (LBPs), gray level co-occurrence matrix (GLCM) features (contrast, correlation, energy and homogeneity), entropy and mean value were calculated as features. All features were computed in an 11 by 11 pixel neighborhood window in the ROI [9]. Ten features based on rotation invariant LBPs were calculated with P = 8, R = 1. Twenty-eight features were chosen and combined to improve the accuracy of our classification algorithm: 10 LBP features, 16 GLCM features (four features at each angle θ = 0°, 45°, 90°, and 135°), entropy and mean value. These features, extracted from all pixels in the ROI, were assembled into a data matrix to feed the SVM. The dimension of the data matrix is N by M, where N is the number of pixels and M is 28, the length of the feature vector. According to the gold standard, three classes need to be distinguished: lipid tissue, fibrous tissue and background. A multi-class support vector machine classifier with a Gaussian radial basis function kernel was chosen to classify the three types. In keeping with the CNN method, 11-fold cross validation was employed in the SVM method. For each group of training OCT images, 500 pixels were randomly selected from each of the lipid, fibrous and background classes, giving 105000 pixels in total (consistent with 500 pixels × 3 classes × 70 training images). The testing set consisted of all pixels in the ROI of the remaining testing OCT images. Training and testing were repeated five times. All multi-class SVM models were implemented in Python, version 3.6.
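The count of ten LBP features for P = 8 is consistent with the uniform rotation-invariant LBP variant, which produces P + 2 = 10 distinct codes; that reading is an assumption, since the text does not name the variant. The sketch below computes such codes and a normalized 10-bin histogram over an 11 by 11 window. As a simplification it uses the eight immediate neighbors without interpolating the diagonal samples onto a circle of radius 1, and all function names are hypothetical.

```python
# Offsets of the 8 neighbours in circular order (3x3 ring, no interpolation).
NEIGHBOURS = [(0, 1), (-1, 1), (-1, 0), (-1, -1), (0, -1), (1, -1), (1, 0), (1, 1)]


def lbp_riu2_code(image, r, c):
    """Uniform rotation-invariant LBP code (0..9) of the pixel at (r, c)."""
    center = image[r][c]
    bits = [1 if image[r + dr][c + dc] >= center else 0 for dr, dc in NEIGHBOURS]
    # Number of 0/1 changes when walking once around the ring.
    transitions = sum(bits[i] != bits[(i + 1) % 8] for i in range(8))
    # Uniform patterns (at most 2 transitions) are coded by their number of
    # ones (0..8); every non-uniform pattern shares the single code 9.
    return sum(bits) if transitions <= 2 else 9


def lbp_histogram(image, row, col, half=5):
    """Normalised 10-bin code histogram in the (2*half+1)-sized window at (row, col).

    Pixels on the image border are skipped because they lack a full ring.
    """
    height, width = len(image), len(image[0])
    hist = [0] * 10
    for r in range(max(1, row - half), min(height - 1, row + half + 1)):
        for c in range(max(1, col - half), min(width - 1, col + half + 1)):
            hist[lbp_riu2_code(image, r, c)] += 1
    total = sum(hist)
    return [h / total for h in hist] if total else [0.0] * 10
```

The ten histogram bins would then form the LBP part of a 28-element feature vector, alongside the GLCM, entropy and mean features described above.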

Gold Standard and Segmentation Accuracy
Figs. 4 & 5 show seven sample OCT slices with our hybrid segmentation by experts (gold standard) and the segmentations by CNN and SVM after processing, respectively. The results of the segmentation methods were evaluated and compared at the pixel level using the OCT dataset and gold standard. Accuracy (Acc), sensitivity (Sen) and specificity (Spe) are defined as follows:
$$\mathrm{Acc} = \frac{TP + TN}{TP + TN + FP + FN}, \qquad \mathrm{Sen} = \frac{TP}{TP + FN}, \qquad \mathrm{Spe} = \frac{TN}{TN + FP} \tag{1}$$
where TP is the number of true positive outcomes, FP is the number of false positive outcomes, TN is the number of true negative outcomes and FN is the number of false negative outcomes.
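As a minimal sketch of the pixel-level evaluation, the following Python computes Acc, Sen and Spe from two flattened label maps, assuming the standard confusion-matrix definitions of these metrics. The label strings and function names are hypothetical; the set of labels treated as positive is passed in explicitly so the same code can serve different positive/negative groupings.

```python
def confusion_counts(truth, pred, positive):
    """Count TP, FP, TN, FN over paired pixel labels.

    truth, pred : sequences of per-pixel labels (gold standard and prediction)
    positive    : set of labels treated as the positive pattern
    """
    tp = fp = tn = fn = 0
    for t, p in zip(truth, pred):
        t_pos, p_pos = t in positive, p in positive
        if t_pos and p_pos:
            tp += 1
        elif not t_pos and p_pos:
            fp += 1
        elif not t_pos and not p_pos:
            tn += 1
        else:
            fn += 1
    return tp, fp, tn, fn


def acc_sen_spe(tp, fp, tn, fn):
    """Acc = (TP+TN)/(TP+TN+FP+FN), Sen = TP/(TP+FN), Spe = TN/(TN+FP).

    A metric whose denominator is zero is returned as None.
    """
    def ratio(num, den):
        return num / den if den else None

    return (ratio(tp + tn, tp + tn + fp + fn),
            ratio(tp, tp + fn),
            ratio(tn, tn + fp))
```

For instance, passing `positive={"LT", "FT"}` treats both tissue labels as the positive pattern and background as the negative pattern.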

Accuracy of Tissue Region Detection
To evaluate the accuracy of tissue region detection, lipid tissue and fibrous tissue were considered as positive patterns (prediction target) while background was regarded as the negative pattern. The accuracy of tissue region detection using the two methods is shown in Tab. 1. The overall average prediction accuracy from the 11 sets was 96.20% and 79.54% for CNN and SVM, respectively. The best accuracy was 97.04% for CNN and 82.65% for SVM. Even in the worst case, the prediction accuracy was 95.08% for CNN and 76.58% for SVM. CNN provided more accurate results than SVM, especially in specificity. SVM always underestimates the tissue region because it takes only the ROI as input.

Accuracy of Objects Classification
The proposed methods can extract lipid tissue (LT), fibrous tissue (FT), and background (BG) directly from OCT images, and categorize each pixel into its corresponding class. Tab. 2 indicates that the prediction accuracy of CNN was 81.92%, 74.14%, 97.67% and 94.29% for FT, LT, BG and overall, respectively. The prediction accuracy of SVM was 74.46%, 65.33%, 63.59% and 69.46% for FT, LT, BG and overall, respectively. CNN produced much better scores in all indexes, and the overall accuracy of CNN was 35.75% higher than that of SVM in relative terms (94.29% versus 69.46%, a difference of 24.83 percentage points). The LT accuracy scores for CNN were lower than the scores for the other objects because the percentage of LT pixels in OCT images was lower than that of the other 2 objects (FT and BG), causing training imbalance.

Evaluation of the Discrimination Between LT and FT in Correct Tissue Region Detected
To further evaluate the discrimination between LT and FT within the correctly detected tissue region, LT and FT were considered as the positive pattern and negative pattern, respectively. The results of differentiating LT and FT are shown in Tab. 3. The overall average accuracy (Acc) was 88.76% and 83.36% for CNN and SVM, respectively. The CNN prediction had its best and worst average Acc, 92.82% and 87.01%, on Set 5 and Set 4, respectively. While SVM showed better Sen for LT, it had worse Spe for FT. Overall, both CNN and SVM provided good discrimination between LT and FT.
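One way to read this evaluation is sketched below: only pixels that are tissue in both the gold standard and the prediction are kept, and within that subset LT is the positive pattern and FT the negative pattern. This interpretation of "correct tissue region detected" is an assumption, as are the label strings and the function name.

```python
def lt_vs_ft_metrics(truth, pred):
    """Acc, Sen, Spe for LT (positive) vs FT (negative) on correctly detected tissue.

    Pixels are kept only when both the gold standard and the prediction
    label them as tissue (LT or FT); all other pixels are ignored.
    """
    tissue = {"LT", "FT"}
    tp = fp = tn = fn = 0
    for t, p in zip(truth, pred):
        if t not in tissue or p not in tissue:
            continue  # background involved: outside the evaluated region
        if t == "LT" and p == "LT":
            tp += 1
        elif t == "FT" and p == "FT":
            tn += 1
        elif t == "FT" and p == "LT":
            fp += 1
        else:  # t == "LT" and p == "FT"
            fn += 1

    def ratio(num, den):
        return num / den if den else None

    return {"acc": ratio(tp + tn, tp + tn + fp + fn),
            "sen": ratio(tp, tp + fn),
            "spe": ratio(tn, tn + fp)}
```

Because pixels with a background label on either side are excluded, these scores measure how well the two tissue types are separated, independently of how well the tissue region itself was found.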

Significance
Identification of the plaque region and segmentation of plaque components in OCT images can provide accurate plaque morphological information, which is the basis for clinical diagnosis, image analysis and image-based computational modeling applications. Image-based biomechanical plaque models may be used for researching and quantifying plaque mechanical stress and strain conditions, identifying possible mechanical factors associated with plaque rupture, and assessing plaque rupture risk. Hence, the accuracy of segmentation for OCT images plays an important role in model construction and prediction of plaque rupture.
The proposed methods are capable of detecting the plaque region and characterizing the plaque components automatically with high accuracy. In addition, the CNN method using the U-Net model has a specific advantage over previous segmentation methods in that it is an "end-to-end" method. The automatic segmentation methods proposed by Athanasiou et al. [8,12] need to detect both the lumen border and the outer border before segmentation. Another method, proposed by Shalev et al. [9], requires five preprocessing steps. The proposed CNN method has no pipeline requiring extensive image preprocessing and post-processing, resulting in a prominent reduction in processing time. Moreover, it is trained and tested on more realistic data, which are produced by combining experts' knowledge with virtual histology, the only commercially available plaque segmentation method used in clinical practice.

Limitations
Although the proposed study uses a more robust gold standard than the methods presented in the literature, the lack of histology is a drawback. Histological data remain the current gold standard in tissue detection. Histology could provide more effective and additional information which could be used to improve and optimize the efficacy of our methodology. In addition, given the limitation of a small dataset, a larger amount of data would be required to improve the accuracy and reliability of our study.

Conclusions
We presented two machine learning approaches to automatically identify the plaque region and characterize plaque composition in OCT images. The gold standard is based on a hybrid annotation approach in which VH-IVUS was used to guide experts' annotations. Both methods can potentially reduce the time spent by doctors in segmenting and evaluating coronary plaque OCT images. However, the CNN-based method provides better segmentation accuracy than the SVM-based method and is more likely to be used in the clinical and research arena.