Classification of polycystic ovary based on ultrasound images using competitive neural network

Infertility in the female reproductive system can result from inhibition of the follicle maturation process, which causes an accumulation of follicles known as polycystic ovaries (PCO). PCO detection is still performed manually by a gynecologist, who counts the number and measures the size of the follicles in the ovaries, so it takes a long time and demands high accuracy. In general, PCO can be detected either by stereology or by feature extraction followed by classification. In this paper, we design a system that classifies PCO using Gabor Wavelet feature extraction and a Competitive Neural Network (CNN). CNN was selected because it combines the Hamming Net and the MaxNet, so that the data can be classified based on the specific characteristics of ultrasound data. In system testing, the Competitive Neural Network achieved a highest accuracy of 80.84% with a processing time of 60.64 seconds (using 32 feature vectors, with weight and bias values of 0.03 and 0.002, respectively).


Introduction
Fertility is one of the factors that may affect the integrity of a household, while infertility is a problem that occurs in the reproductive system of women or men. Inhibition of the follicular maturation process can interfere with ovulation in women, causing an accumulation of follicles known as polycystic ovaries (PCO). PCO symptoms can be detected early through hormone examination. Because such examinations are very expensive, many people turn to ultrasonography (USG) examination, which produces an ultrasound image as shown in Figure 1.
According to the Rotterdam conference [1], a patient is said to have polycystic ovary syndrome (PCOS) if she meets two of three criteria: (1) failure of ovulation, (2) high androgen hormone levels, or (3) the presence of polycystic ovaries. Morphologically, polycystic ovaries are present when there are twelve or more follicles 2-9 mm in diameter and/or the ovarian volume exceeds 10 cm³ [2]. The ultrasound image is checked manually by a doctor, who counts the number and measures the size of the follicles in the ovaries. However, this examination takes a long time and demands high accuracy to determine whether the patient has polycystic ovary syndrome.
This problem can be addressed with an automated system that detects PCO from ultrasound images. In general, follicles can be detected using either a machine learning approach (feature extraction and classification) or a stereology approach. This research uses the machine learning approach, with the Gabor Wavelet method for feature extraction and a Competitive Neural Network for classification.
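The morphological criterion quoted above can be written as a simple check. This is a minimal sketch: the function name and its inputs (a list of measured follicle diameters and an ovarian volume) are hypothetical, and it covers only the morphological criterion, not the full two-of-three diagnosis.

```python
def is_polycystic(follicle_diameters_mm, ovarian_volume_cm3):
    """Return True if the morphological criterion from the text is met.

    Hypothetical helper: twelve or more follicles of 2-9 mm in diameter,
    and/or an ovarian volume above 10 cm^3.
    """
    # Count follicles whose diameter lies in the 2-9 mm range.
    count = sum(1 for d in follicle_diameters_mm if 2.0 <= d <= 9.0)
    return count >= 12 or ovarian_volume_cm3 > 10.0


print(is_polycystic([3.5] * 12, 8.0))  # True: twelve qualifying follicles
print(is_polycystic([3.5] * 5, 8.0))   # False: too few follicles, normal volume
```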

Related Works
Follicle detection can be done in two ways, namely stereology and a sequential process of feature extraction and classification. In the paper entitled Follicle Ultrasound Detection on the Images to Support Determination Polycystic Ovary Syndrome [3], polycystic ovaries are identified using stereology to calculate the number and size of each follicle and Euclidean distance to measure the diameter of the follicle. Follicles can also be segmented using a region growing method, which tests whether the neighbors of the initial seeds should be added to the segmentation region. In previous research [9, 10], ultrasound images were segmented using edge detection. In [9], a median filter is used to remove noise from the image. It is based on storing and updating the gray level histogram of the picture elements in the window. The main idea of the median filter is to find the median within a specific window and to replace the center of the window with that median. Meanwhile, Otsu global thresholding [10] is a way to find the similarity of each pixel to its neighbors. Otsu's thresholding method iterates through all possible threshold values and calculates a measure of spread for the pixel levels on each side of the threshold, i.e. the pixels that fall in either the foreground or the background. Canny introduced a well-known edge detection method in [11], a computational approach that smooths the image with a Gaussian and optimizes the trade-off between noise filtering and edge localization. It can be used to detect follicles in ultrasound images diagnosed as PCO.
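The exhaustive threshold search described for Otsu's method can be sketched as follows. This is an illustrative, dependency-free version, not the implementation used in [10]: it assumes 8-bit gray levels given as a flat list, and it maximizes the between-class variance, which is equivalent to minimizing the spread within the two classes.

```python
def otsu_threshold(pixels, levels=256):
    """Return the gray level t that maximizes the between-class variance.

    Pixels with value <= t form one class, pixels with value > t the other.
    """
    hist = [0] * levels
    for v in pixels:
        hist[v] += 1
    total = len(pixels)
    sum_all = sum(i * h for i, h in enumerate(hist))

    best_t, best_var = 0, -1.0
    w0 = 0    # number of pixels in the lower class
    sum0 = 0  # sum of gray levels in the lower class
    for t in range(levels):
        w0 += hist[t]
        sum0 += t * hist[t]
        w1 = total - w0
        if w0 == 0 or w1 == 0:
            continue  # one class is empty, so this threshold separates nothing
        mean0 = sum0 / w0
        mean1 = (sum_all - sum0) / w1
        between = w0 * w1 * (mean0 - mean1) ** 2
        if between > best_var:
            best_var, best_t = between, t
    return best_t


# Two well-separated groups of pixels: any threshold between them is optimal.
t = otsu_threshold([10] * 50 + [200] * 50)
print(10 <= t < 200)
```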

Proposed Schema
The data used in this research are ovarian ultrasound images obtained from Klinik Bersalin Permata Bunda Syariah, Cirebon, and validated by a gynecologist. As shown in Figure 2, each ultrasound image passes through four steps. First, the image is preprocessed. Segmentation is then performed to obtain the follicles, and features are extracted using the Gabor Wavelet. The final step classifies the follicles using the Competitive Neural Network to produce a result of "1" or "2", where "1" is non-PCO and "2" is PCO. Figure 2 describes the general scheme of the PCO detection system.

Figure 2. System Flowchart
Preprocessing is an important step before the data are processed. This stage produces data of better quality, from which the important information contained in the image can be obtained more easily. Preprocessing consists of several processes, namely grayscaling, histogram equalization, image binarization, morphology, image inversion, and data cleaning. Grayscaling converts color images into grayscale images. Histogram equalization is used to bring out differences among pixels whose gray levels occur with low frequency. The output of this process is an image with a more uniform histogram, so that the important information contained in the image is not lost. Binarization changes the image matrix values to 1 or 0, which turns the image into black and white. The morphological operations used are erosion and dilation. Erosion shrinks the objects in the binary image so that objects considered noise disappear. Dilation then thickens the remaining objects to restore what they lost during erosion. Image inversion changes black into white and vice versa, because an object is more easily detected when it is black and the background is white. The final process is data cleaning, which removes objects that are not important.
Segmentation is the step that separates an object from the background. This step has two processes, namely follicle edge detection and cropping. Edge detection aims to recognize the objects contained in the image. In this research, the object is the follicle, and edge detection is conducted using the Canny method. Each detected follicle is then labeled, and every labeled follicle is cropped to produce a new image that is used in the next process.
Feature extraction is the step that extracts the features (information) of an object in an image so that it can be distinguished from other objects. This research uses a Gabor Wavelet feature vector computed from each follicle. Figure 3 describes the flowchart of the Gabor Wavelet process.
Figure 3. Gabor Wavelet Process
Classification is the last step, in which the data are assigned to classes. In this research, the data are classified into class "1", which means non-PCO, or class "2", which means PCO. The method used is the Competitive Neural Network (CNN), which has a single-layer neural network architecture: the network has only an input layer and an output layer. CNN uses unsupervised learning, in which there is no target output. CNN was selected because it combines the Hamming Net and the MaxNet, so that the data can be classified based on the specific characteristics of ultrasound data. Learning in CNN uses the winner-take-all (WTA) scheme: all neurons in the network compete, and only one winning neuron is produced. The training and testing processes of the competitive neural network are shown in Figure 4. CNN classification consists of two processes, namely:
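The preprocessing operations listed above can be sketched in plain Python on images stored as lists of rows. This is a minimal illustration under stated assumptions, not the authors' implementation: the luma weights, the 3x3 structuring element, the clipped border handling, and all function names are choices made here, and the data cleaning step is omitted.

```python
def to_gray(rgb):
    """Convert rows of (r, g, b) tuples to 0-255 gray levels (assumed luma weights)."""
    return [[int(round(0.299 * r + 0.587 * g + 0.114 * b)) for r, g, b in row]
            for row in rgb]


def equalize(gray, levels=256):
    """Histogram equalization through the cumulative distribution function."""
    flat = [v for row in gray for v in row]
    hist = [0] * levels
    for v in flat:
        hist[v] += 1
    cdf, running = [], 0
    for h in hist:
        running += h
        cdf.append(running)
    cdf_min = next(c for c in cdf if c > 0)
    denom = max(len(flat) - cdf_min, 1)  # avoid division by zero on flat images
    return [[round((cdf[v] - cdf_min) * (levels - 1) / denom) for v in row]
            for row in gray]


def binarize(gray, threshold):
    """Map gray levels above the threshold to 1 and the rest to 0."""
    return [[1 if v > threshold else 0 for v in row] for row in gray]


def _morph(binary, keep):
    """Apply `keep` (all for erosion, any for dilation) to each 3x3 neighborhood.

    Neighborhoods are clipped at the image border.
    """
    h, w = len(binary), len(binary[0])
    out = []
    for y in range(h):
        row = []
        for x in range(w):
            neigh = [binary[j][i]
                     for j in range(max(y - 1, 0), min(y + 2, h))
                     for i in range(max(x - 1, 0), min(x + 2, w))]
            row.append(1 if keep(neigh) else 0)
        out.append(row)
    return out


def erode(binary):
    return _morph(binary, all)


def dilate(binary):
    return _morph(binary, any)


def invert(binary):
    """Swap black and white in a binary image."""
    return [[1 - v for v in row] for row in binary]
```

Applying `dilate(erode(image))` removes objects smaller than the structuring element while restoring the approximate size of the larger ones, which matches the erosion-then-dilation order described in the text.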
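The labeling and cropping stage can be sketched as below. The sketch assumes that a binary mask with 1 for follicle pixels is already available; it does not implement Canny edge detection, and the 4-connectivity and the function name are assumptions made for illustration.

```python
from collections import deque


def label_and_crop(binary):
    """Label 4-connected foreground regions and crop each to its bounding box.

    `binary` is a list of rows with 1 for object pixels and 0 for background.
    Returns one cropped sub-image per region, in scan order.
    """
    h, w = len(binary), len(binary[0])
    seen = [[False] * w for _ in range(h)]
    crops = []
    for y in range(h):
        for x in range(w):
            if binary[y][x] != 1 or seen[y][x]:
                continue
            # Breadth-first flood fill collects one connected region.
            queue, region = deque([(y, x)]), []
            seen[y][x] = True
            while queue:
                cy, cx = queue.popleft()
                region.append((cy, cx))
                for ny, nx in ((cy - 1, cx), (cy + 1, cx), (cy, cx - 1), (cy, cx + 1)):
                    if (0 <= ny < h and 0 <= nx < w
                            and binary[ny][nx] == 1 and not seen[ny][nx]):
                        seen[ny][nx] = True
                        queue.append((ny, nx))
            ys = [p[0] for p in region]
            xs = [p[1] for p in region]
            # Crop the bounding box of the region from the mask.
            crops.append([row[min(xs):max(xs) + 1]
                          for row in binary[min(ys):max(ys) + 1]])
    return crops


mask = [[1, 1, 0, 0],
        [1, 1, 0, 0],
        [0, 0, 0, 1]]
print(len(label_and_crop(mask)))  # 2 regions
```

Each crop is a rectangular cut of the mask, so a bounding box may include pixels of a neighboring region if two regions lie close together.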
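One common way to build a Gabor feature vector is to filter the image with a bank of Gabor kernels at several scales and orientations and keep one statistic per filter. The sketch below follows that idea; the source does not specify the filter parameters, so the kernel size, wavelengths, sigma, aspect ratio, and the choice of 4 scales by 8 orientations (giving 32 features) are all assumptions.

```python
import math


def gabor_kernel(size, wavelength, theta, sigma, gamma=0.5):
    """Real part of a Gabor filter as a size x size list of rows (size must be odd)."""
    half = size // 2
    kernel = []
    for y in range(-half, half + 1):
        row = []
        for x in range(-half, half + 1):
            # Rotate the coordinates by the filter orientation.
            xr = x * math.cos(theta) + y * math.sin(theta)
            yr = -x * math.sin(theta) + y * math.cos(theta)
            envelope = math.exp(-(xr ** 2 + (gamma * yr) ** 2) / (2 * sigma ** 2))
            row.append(envelope * math.cos(2 * math.pi * xr / wavelength))
        kernel.append(row)
    return kernel


def mean_abs_response(image, kernel):
    """Mean absolute filter response over all positions where the kernel fits."""
    k = len(kernel)
    h, w = len(image), len(image[0])
    total, count = 0.0, 0
    for y in range(h - k + 1):
        for x in range(w - k + 1):
            acc = sum(kernel[j][i] * image[y + j][x + i]
                      for j in range(k) for i in range(k))
            total += abs(acc)
            count += 1
    return total / count if count else 0.0


def gabor_features(image, n_scales=4, n_orientations=8, size=7):
    """One feature per (scale, orientation) pair; all parameters are illustrative."""
    features = []
    for s in range(n_scales):
        wavelength = 4.0 * math.sqrt(2) ** s
        for o in range(n_orientations):
            theta = o * math.pi / n_orientations
            kernel = gabor_kernel(size, wavelength, theta, sigma=0.56 * wavelength)
            features.append(mean_abs_response(image, kernel))
    return features


image = [[(x + y) % 256 for x in range(9)] for y in range(9)]
print(len(gabor_features(image)))  # 32 features with the default settings
```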

Training Process
The training process uses a training dataset and starts by initializing all weights randomly. Two training phases are then performed repeatedly to update the weight values. The two phases are:

Competing Phase
In this phase, all neurons compete to become the winning neuron that is rewarded in the rewarding phase. The steps in this phase are:
(i) Choosing an input vector randomly from the entire data.
(ii) Calculating the output o of every output node from the input vector p and the weight vector w according to Equation 1.
(iii) Determining the winning neuron among all output nodes.
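The three steps of the competing phase can be sketched as follows. The output formula is not reproduced in this text, so the sketch assumes the usual net input of a competitive layer, the dot product of each node's weight vector with the input vector; the function names are hypothetical.

```python
import random


def net_outputs(weights, p):
    """Output of each node, assumed here to be the dot product of w and p."""
    return [sum(w_j * p_j for w_j, p_j in zip(w, p)) for w in weights]


def winner(outputs):
    """Index of the node with the largest output (winner-take-all)."""
    return max(range(len(outputs)), key=outputs.__getitem__)


def competing_phase(weights, data, rng=random):
    p = rng.choice(data)         # (i) pick a random input vector
    o = net_outputs(weights, p)  # (ii) compute the output of every node
    return p, winner(o)          # (iii) determine the winning neuron


weights = [[1.0, 0.0], [0.0, 1.0]]
print(winner(net_outputs(weights, [0.2, 0.9])))  # 1: the second node wins
```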

Rewarding Phase
The purpose of this phase is to reward the neuron that won the competing phase. The weight vector of that neuron is updated according to Equation 2 so that the winner moves closer to the input vector, where w is the weight vector, ξ is the competitive learning rate (bias), and p is the input vector.
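A minimal sketch of the rewarding update and of the loop that alternates the two phases is given below. The update formula is not reproduced in this text, so the sketch assumes the standard competitive learning rule, in which the winning weight vector moves a fraction ξ of the way toward the input, w ← w + ξ(p − w). The number of neurons, the iteration count, the seed, and the default ξ are illustrative assumptions.

```python
import random


def reward(weights, winner_index, p, xi):
    """Move the winning weight vector a fraction xi of the way toward p."""
    w = weights[winner_index]
    weights[winner_index] = [w_j + xi * (p_j - w_j) for w_j, p_j in zip(w, p)]
    return weights


def train(data, n_neurons=2, xi=0.002, iterations=100, seed=0):
    """Alternate the competing and rewarding phases (illustrative settings)."""
    rng = random.Random(seed)
    dim = len(data[0])
    # Random initialization of all weights, as described in the training process.
    weights = [[rng.random() for _ in range(dim)] for _ in range(n_neurons)]
    for _ in range(iterations):
        p = rng.choice(data)                                    # competing phase
        outputs = [sum(a * b for a, b in zip(w, p)) for w in weights]
        k = outputs.index(max(outputs))
        reward(weights, k, p, xi)                               # rewarding phase
    return weights


print(reward([[0.0, 0.0], [1.0, 1.0]], 0, [1.0, 0.0], 0.5)[0])  # [0.5, 0.0]
```

Only the winning neuron is updated in each iteration; the weights of the losing neurons stay unchanged.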

Testing Process
This process uses a testing dataset. Testing is conducted using the hyperplane obtained from the training process to classify each detected follicle as "1" (non-PCO) or "2" (PCO).
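The testing step can be sketched as assigning each test vector to the class of its winning neuron and comparing the result with the validated label. The sketch assumes that the first neuron corresponds to class "1" and the second to class "2"; because training is unsupervised, this mapping would have to be established after training. The function names and the dot-product output are likewise assumptions.

```python
def classify(weights, p):
    """Return class label 1 or 2 from the index of the winning neuron."""
    outputs = [sum(w_j * p_j for w_j, p_j in zip(w, p)) for w in weights]
    return outputs.index(max(outputs)) + 1


def accuracy(weights, samples, labels):
    """Percentage of test samples whose predicted class matches the label."""
    correct = sum(1 for p, y in zip(samples, labels) if classify(weights, p) == y)
    return 100.0 * correct / len(samples)


weights = [[1.0, 0.0], [0.0, 1.0]]
print(classify(weights, [0.1, 0.9]))  # 2
```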

Testing of Weight Parameter
This test aims to analyze the influence of the weight parameter on accuracy. The weight parameter is used in the competing phase to obtain the net input. Ten weight values were tested: 0.01, 0.02, 0.03, 0.04, 0.05, 0.1, 0.15, 0.2, 0.25, and 0.3. The best accuracy is 80.84%, obtained with weights of 0.03, 0.15, 0.2, and 0.25 at 32 feature vectors. Figure 5 shows that the accuracy obtained for each weight parameter is almost stable at 16, 24, and 32 feature vectors.
Based on Figure 7, the best accuracy is 80.84% at 32 feature vectors. In these tests, the accuracy of PCO classification increases with the number of feature vectors in the Gabor Wavelet process: the greater the number of feature vectors, the greater the accuracy obtained when classifying PCO from ultrasound images.

Conclusion
Classification of polycystic ovaries can be performed using a machine learning approach (feature extraction and classification). One such classification method is the Competitive Neural Network. The best accuracy is 80.84%, obtained with 32 feature vectors (processing time of 60.64 seconds). This is because classification accuracy with the Competitive Neural Network increases with the number of feature vectors.