Fish Species Classification using SVM Kernels

This paper presents the classification of fish species using the support vector machine (SVM) algorithm with four kernel functions: linear, polynomial, sigmoid and radial basis function (RBF). The dataset for this research is obtained from the Fish-Pak website, which provides the images needed to classify two fish species, Catla and Rohu, using three fish features: head, body and scale. Because the number of Rohu images is not equal to the number of Catla images, image augmentation is used to balance the two classes. The simulation results reveal that SVM with the RBF kernel provides an accuracy of 78 %.


Introduction
Geographically, oceans occupy around seventy percent of the earth's surface. In marine ecosystems, a wide variety of fishes are distributed across the ocean and help balance nature. Approximately 22,000 fish varieties are known, which makes fish the most numerous group reported among living vertebrates [1]. Therefore, the study of fish varieties is essential to preserve and protect marine biology and aquaculture.
Identification of underwater fish in ponds or freshwater is needed by marine biologists, researchers and scientists who observe the behavior of different kinds of fishes. It is also in demand among tourists. Fish farms rely heavily on monitoring the different kinds of fishes and their habitats in order to propagate fishes of the same type and to study their life cycle. Environmental and climatic conditions strongly influence fish species and their habitats. At present, fish classification is done manually, which is time consuming and requires considerable effort to obtain samples in diverse environments [2].
Data on the distribution of fish species within a region is important for observing the current status and health of fish populations, and in particular of the types targeted by fisheries. Fishes and their habitats are under increasing pressure owing to global changes occurring in the oceanic environment [3,4]. For periodic monitoring, a standardized, reliable and cost-effective method is important for predicting fish species across depths and habitats [5]. Manual fish classification, which involves capturing sample clusters and visual inspection by divers, remains the most commonly used method [6].
The behavioural study of fish species can be automated by obtaining visual feedback from inspections carried out at numerous locations; automated visual fish classification would provide a significant quantity of data for pattern identification. Many advanced techniques have been reported for classifying fish species from images or videos captured underwater [7][8][9][10] or in tanks with artificial lighting [11]. However, the literature review reveals that videos and datasets taken in underwater conditions have not yet led to an efficient way of classifying fish species, which makes identifying and recognizing fish varieties a challenging task. The major challenges in classifying fish species underwater are water murkiness, poor background lighting, jitter in captured images, occlusions, and similar textures and structures among different fish types, all of which reduce accuracy [12].
Fish classification is a multi-class classification problem, which is an active research area in computer vision and machine learning. The algorithms reported in [13][14] classify fish species based on texture and shape feature extraction and matching.

Related works
Recently, a variety of machine learning algorithms have been reported for classifying underwater species. Storbeck et al. [15] used laser light to measure key features such as the width, thickness and length of different species by means of three-dimensional modelling of the fish. Unconstrained classification of fish species is a challenging task because background confusion with aquatic plants and reefs, changes in lighting conditions and water turbidity are all possible. A further challenge to accurate classification is that different fish species can share similar texture, shape and color. The researchers in [16-17] investigated classifying fish varieties in two conventional ways, based on texture and on shape, under uncontrolled natural environmental conditions.
The authors of [18] implemented a fish species identification method using videos recorded in unrestricted environments, depending on the enormous biomass availability. The researchers in [19] presented work to classify fish species automatically using geometry, texture and morphology features with an Artificial Neural Network (ANN) algorithm. Huang et al. [20] demonstrated the classification of live fish in the open sea by means of a hierarchical classification technique. Fish varieties identified from low-resolution images are reported in [21]. Morphological techniques and a filter-based method are proposed in [22] to categorize varieties of fishes with speckle pattern features and scale fishes. Huang et al. [23] proposed predicting fish species from a dataset of 24000 images of 15 fish varieties, with a prediction accuracy of 74.8 % using SVM with a Gaussian mixture model.
The authors of [24] proposed a method that employs kernel descriptors and efficient match kernels as fish features and a multi-class SVM classification algorithm for training, achieving an accuracy of 84.4 % on 50000 images of 10 fish varieties. The authors of [25] classified fish species by applying image warping before SVM classification, attaining an accuracy of 90 % on a dataset of 320 images. The work in [26] classified fish species using shape and texture features with the Euclidean distance method and an Artificial Neural Network, which produced recognition accuracies of 99 % and 81.67 % respectively. The authors of [27] used the SVM algorithm with fish shape as the feature, with 76 training and 74 testing images, and obtained a prediction accuracy of 78.59 %. In this research work, SVM with four different kernel functions, namely linear, polynomial, sigmoid and RBF, is used to classify the fish species.

Support Vector Machine (SVM):
SVM is used across machine learning, neural networks, pattern recognition and data mining. The major concept of the SVM technique is depicted in Fig. 1: it can be viewed as a function that divides two classes, true (positive) and false (negative), in feature space. The central problem is to find a hyperplane that divides the classes with the maximum margin; given data labelled as positive or negative, the goal is to identify the hyperplane that separates them. SVM is a kernel-based technique with good generalization and strong computational power for regression and classification problems [28], and its solid theoretical background gives reliable results in comparison with other algorithms. Figure 1 shows that SVM employs a maximum margin separator [29], that is, the decision boundary that lies as far as possible from the nearest sample points. The technique can also classify nonlinear data by mapping the data to a higher-dimensional space using the kernel trick. In general, data that cannot be divided linearly in the original input space can be separated in a higher-dimensional feature space. The algorithm is used for solving complex problems and is resistant to overfitting. The authors of [30] reported that SVMs can be employed for binary classification and give accurate results with a small amount of sample data. Although the algorithm was originally designed for binary classification, it has been extended to multi-class problems with non-linear data.
A kernel-based SVM can, for example, treat a 1-D dataset as a 2-D classification problem. With a kernel function, data in a lower-dimensional space is projected into a higher-dimensional space. The kernels used for decision making here are the linear, polynomial, sigmoid and radial basis function (RBF) kernels.
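The projection from a lower-dimensional to a higher-dimensional space can be illustrated with a small sketch. The data points and the explicit feature map x -> (x, x^2) below are hypothetical and chosen only for illustration; they are not taken from the study.

```python
# 1-D points: the positive class sits between the negative points,
# so no single threshold on x separates the two classes.
xs = [-3.0, -2.0, -1.0, 0.0, 1.0, 2.0, 3.0]
labels = [-1, -1, 1, 1, 1, -1, -1]


def separable_by_threshold(values, labels):
    """Return True if a single threshold splits the classes perfectly in 1-D."""
    ordered = [lab for _, lab in sorted(zip(values, labels))]
    # Separable in 1-D means the sorted labels change class at most once.
    changes = sum(1 for a, b in zip(ordered, ordered[1:]) if a != b)
    return changes <= 1


def feature_map(x):
    """Hypothetical explicit map from 1-D to 2-D: x -> (x, x^2)."""
    return (x, x * x)


mapped = [feature_map(x) for x in xs]

# In the 2-D space the line x2 = 2.5 separates the classes:
# positives have x^2 <= 1, negatives have x^2 >= 4.
predictions = [1 if x2 < 2.5 else -1 for _, x2 in mapped]

print(separable_by_threshold(xs, labels))  # separability in the original 1-D space
print(predictions == labels)               # separability after the 2-D mapping
```

A kernel function achieves the same effect without computing the mapped coordinates explicitly, which is why it remains practical when the feature space is very high-dimensional.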

Linear Kernel
This is the most straightforward kernel function, generally employed when the features of a dataset can be separated by a single line. SVM with this kernel works faster than with the other kernels. The linear kernel function is expressed in Equation (1), where i and j denote input space vectors and C denotes the constant of proportionality.

Polynomial Kernel:
In some cases, linear separation of a dataset is not feasible owing to disturbance (noise) in the data. To handle such problems, the dataset is mapped into a different feature space in which it becomes linearly separable. The polynomial kernel is widely used with SVM and other kernel-based models; it represents the similarity of vectors in a feature space over polynomials of the original variables, which allows non-linear models to be learned. The mathematical expression for this kernel function is given in Equation (2), where C is a free parameter, greater than or equal to 0, that trades off the influence of lower-order terms against higher-order terms.

Sigmoid kernel:
This kernel comes from neural network research, in which the bipolar sigmoid function is commonly used as the activation function for artificial neurons. An SVM with the sigmoid kernel corresponds to a two-layer perceptron neural network. The mathematical expression for this function is given in Equation (3), where γ denotes the scaling factor of the input data and C is a shifting parameter that controls the threshold of the mapping.

RBF kernel
This kernel function, also referred to as the Gaussian kernel, is one of the most popular kernel functions in machine learning and other kernel-based techniques, and is generally preferred for SVM classification. The governing equation for this function is given in Equation (4), where γ is the spread parameter.
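The four kernels described in the preceding subsections can be sketched in a few lines. The forms below are the commonly used textbook definitions and are an assumption here, since they may differ in detail from Equations (1) to (4); in particular, the constant C is treated as an additive offset, and the default parameter values are illustrative only.

```python
import math


def dot(x, y):
    """Inner product of two equal-length vectors."""
    return sum(a * b for a, b in zip(x, y))


def linear_kernel(x, y, c=0.0):
    """Assumed form: K(x, y) = x . y + c."""
    return dot(x, y) + c


def polynomial_kernel(x, y, c=1.0, d=3):
    """Assumed form: K(x, y) = (x . y + c)^d, with c >= 0 and degree d."""
    return (dot(x, y) + c) ** d


def sigmoid_kernel(x, y, gamma=0.001, c=0.0):
    """Assumed form: K(x, y) = tanh(gamma * x . y + c)."""
    return math.tanh(gamma * dot(x, y) + c)


def rbf_kernel(x, y, gamma=0.1):
    """Assumed form: K(x, y) = exp(-gamma * ||x - y||^2)."""
    squared_distance = sum((a - b) ** 2 for a, b in zip(x, y))
    return math.exp(-gamma * squared_distance)


u, v = [1.0, 2.0], [3.0, 4.0]
for kernel in (linear_kernel, polynomial_kernel, sigmoid_kernel, rbf_kernel):
    print(kernel.__name__, kernel(u, v))
```

Each function returns a similarity score for a pair of feature vectors; the RBF kernel equals 1 for identical vectors and decays towards 0 as the vectors move apart, at a rate set by γ.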

Experimentation and Result
The block diagram of the fish species classification using SVM technique is depicted in Figure 2.

Image augmentation
Data augmentation is used to increase the number of samples available for training the model. To predict images reliably, a machine learning algorithm requires plenty of training data, which is not available in every situation. Data augmentation addresses this by generating the required number of samples from the available dataset to get better results.
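A minimal sketch of augmentation by flipping and rotation, the two operations this study reports using, is given below. Images are represented as nested lists of pixel values for simplicity; the helper names and the order in which transforms are applied are assumptions, not the authors' implementation.

```python
def flip_horizontal(img):
    """Mirror the image left to right."""
    return [row[::-1] for row in img]


def flip_vertical(img):
    """Mirror the image top to bottom."""
    return [row[:] for row in img[::-1]]


def rotate_90_clockwise(img):
    """Rotate the image by 90 degrees clockwise."""
    return [list(col) for col in zip(*img[::-1])]


def augment_to_target(images, target):
    """Append transformed copies until the class holds `target` images."""
    if not images:
        raise ValueError("at least one source image is required")
    transforms = [flip_horizontal, flip_vertical, rotate_90_clockwise]
    augmented = list(images)
    i = 0
    while len(augmented) < target:
        source = images[i % len(images)]
        # Move on to the next transform after each full pass over the originals.
        transform = transforms[(i // len(images)) % len(transforms)]
        augmented.append(transform(source))
        i += 1
    return augmented


# Toy 2x2 "images" standing in for the 120 Rohu images, raised to 200.
rohu = [[[1, 2], [3, 4]] for _ in range(120)]
balanced = augment_to_target(rohu, 200)
print(len(balanced))
```

In practice an image library would perform the same operations on real pixel arrays; the balancing logic stays the same.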

Result and Discussion
The performance of the kernels is governed by two parameters, C and gamma. The C parameter controls the trade-off between correctly classifying the training data and keeping the decision boundary simple. The gamma parameter defines how far the influence of a single training example reaches. In this research work, four types of kernel functions are used: linear, polynomial, sigmoid and RBF. The cost value C = 0.001 is set for the linear kernel, and C = 0.01 and gamma = 0.1 for the RBF kernel. Similarly, gamma = 0.001 is set for the sigmoid kernel and the degree d = 3 for the polynomial kernel.
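The settings above can be collected into a single configuration. The sketch below assumes scikit-learn's SVC as the classifier, which the text does not name; any parameter not stated in the text is left at the library default.

```python
# Hyperparameters as stated in the text, keyed by kernel name.
KERNEL_PARAMS = {
    "linear": {"kernel": "linear", "C": 0.001},
    "rbf": {"kernel": "rbf", "C": 0.01, "gamma": 0.1},
    "sigmoid": {"kernel": "sigmoid", "gamma": 0.001},
    "polynomial": {"kernel": "poly", "degree": 3},
}


def build_classifiers(params=KERNEL_PARAMS):
    """Create one untrained SVC per kernel (requires scikit-learn)."""
    # Imported lazily: scikit-learn is an assumed dependency, not named in the paper.
    from sklearn.svm import SVC
    return {name: SVC(**kwargs) for name, kwargs in params.items()}


for name, kwargs in KERNEL_PARAMS.items():
    print(name, kwargs)
```

Calling `build_classifiers()` returns a dictionary of classifiers that can each be fitted on the training features and evaluated on the held-out test set.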

Linear Kernel
The results of fish species classification for SVM with the linear kernel are presented as a confusion matrix in Figure 5.
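Accuracy can be read off a confusion matrix as the share of test images on the diagonal. The sketch below uses hypothetical counts chosen only to show the arithmetic; they are not the values in Figure 5 and not results from this study.

```python
def accuracy(confusion):
    """Fraction of samples on the diagonal of a square confusion matrix."""
    correct = sum(confusion[i][i] for i in range(len(confusion)))
    total = sum(sum(row) for row in confusion)
    return correct / total


def per_class_recall(confusion):
    """Recall for each class: diagonal entry divided by its row total."""
    return [confusion[i][i] / sum(confusion[i]) for i in range(len(confusion))]


# Hypothetical counts: rows are true classes (Catla, Rohu),
# columns are predicted classes (Catla, Rohu).
example = [
    [35, 5],
    [10, 30],
]

print(accuracy(example))
print(per_class_recall(example))
```

Per-class recall is shown alongside accuracy because a single accuracy figure can hide a classifier that favours one species over the other.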

Conclusion
This research work presents fish species classification based on SVM with four kinds of kernel function. SVM with the RBF kernel gives the maximum accuracy of 78 % compared with the linear, sigmoid and polynomial kernels, owing to its unique features. This study can be extended to enhance efficiency and accuracy with different types of algorithms as well as larger datasets for the recognition of fish species.
© IJAICT India Publications 2020. M.G. Sumithra et al. (eds.), Advances in Computing, Communication, Automation and Biomedical Technology, https://doi.org/10.46532/978-81-950008-1-4_102

Fig. 2: Block diagram of fish species classification using SVM classifier

Fish image dataset
The datasets for this study are taken from the Fish-Pak website. They contain images of two fish varieties, Catla (Thala) and Labeo rohita (Rohu), taken from well-known fish farming areas across the globe, including Pakistan. In this study, 120 images of the Rohu variety were taken manually at different places along the rivers of Punjab, Chenab and Pakistan. Similarly, images of the Thala variety were captured at Punjab, Sialkot and Marala headworks. Figure 3 illustrates sample images of the Catla and Rohu varieties taken from the datasets.

To bring both classes to an equal count of 200 images, data augmentation techniques such as rotation and flipping are employed. Figure 4 shows sample images after applying rotation and flipping. In this study, 80 % of the data is used for training and 20 % for testing, as given in Table 1.

Table 1: Number of datasets after image augmentation
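The 80/20 division of the balanced dataset can be sketched as follows. Splitting each class separately (a stratified split), the fixed random seed and the file-name placeholders are assumptions made for illustration; the text states only the class size of 200 and the 80 % / 20 % proportions.

```python
import random


def stratified_split(samples_by_class, train_fraction=0.8, seed=0):
    """Split each class separately so both sets keep the class balance."""
    rng = random.Random(seed)
    train, test = [], []
    for label, samples in samples_by_class.items():
        shuffled = list(samples)
        rng.shuffle(shuffled)
        cut = round(len(shuffled) * train_fraction)
        train += [(sample, label) for sample in shuffled[:cut]]
        test += [(sample, label) for sample in shuffled[cut:]]
    return train, test


# Placeholder identifiers standing in for the 200 images per class.
dataset = {
    "Catla": [f"catla_{i}" for i in range(200)],
    "Rohu": [f"rohu_{i}" for i in range(200)],
}

train_set, test_set = stratified_split(dataset)
print(len(train_set), len(test_set))
```

With 200 images per class, this yields 160 training and 40 testing images for each species, or 320 and 80 in total.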