Predictive analysis on hyperspectral images using deep learning

In the field of remote sensing, classification of hyperspectral images (HSI) has become a trending topic. Hyperspectral analysis also struggles with the non-linear relationship between the acquired spectral data and the actual material in the image. In recent years, machine learning algorithms have been established as efficient feature extraction methods that tackle such non-linearities well and are commonly used in a variety of image classification tasks. Inspired by these successful applications, deep learning has also been applied to extract features from HSI and has demonstrated good performance. This paper provides an ordered review of the available literature on deep learning for hyperspectral image classification and compares multiple approaches. Specifically, we summarize the main HSI classification challenges that conventional machine learning approaches have not been able to solve successfully, and describe how deep learning addresses these issues. This study improves data abstraction with minimized uncertainty and enhances HSI classification performance. First, a CNN model is built to learn the spectral characteristics of the HSI. The CNN is used as a pixel classifier; therefore, it operates only in the spectral domain.


Introduction
Hyperspectral image classification is the process of classifying every pixel of an image captured by spectral sensors. In this paper we bring the principles of deep learning to the classification of hyperspectral image datasets. Hyperspectral images capture spectral and spatial information from a distance. Spatial information has been taken into account in recent years, and several spectral-spatial classifiers have been proposed; these methodologies offer significant benefits in enhancing classification quality.
The detailed spectral information helps increase the accuracy of distinguishing the materials that cover the surface of the earth. On the other hand, spatial resolution data plays a major role in a wide range of applications such as rural-urban development and monitoring the movement of land. We therefore first follow the classical spectral-information-based classification. Secondly, a new approach is brought forward to classify hyperspectral objects with spatial-domain knowledge. Then a deep learning structure is proposed to fuse the two kinds of features, by which we obtain the best classification precision. The structure is a composite of principal component analysis (PCA), a convolutional neural network model, and logistic regression (LR). Distinctively, in the deep learning construction, stacked autoencoders (SAE) help in achieving efficient and high-level features.
In the past two decades, numerous HSI classification techniques have been put forward. Specifically, various supervised deep learning algorithms have been explored for hyperspectral image classification. While unsupervised classification depends only on the data itself, supervised classification of every pixel in the scene is typically much more precise, as abundant spectral and spatial information is available. Shallow classifiers primarily divide the task into two stages: feature extraction and classification. Typical classifiers such as linear SVM and LR can be regarded as single-layer classifiers, whereas a decision tree or an SVM with kernels can be seen as having two layers. Rooted in neuroscience, human brains perform well at tasks such as object recognition because of their multiple stages of retina-to-cortex processing. Likewise, deep learning models extract more complex and invariant information features through multiple processing layers, and are therefore considered capable of achieving better classification precision than standard, shallower classifiers. Such deep models have produced positive results in many fields, including recognition and image classification tasks.
The following sections of the paper contain the literature survey, where we review the different algorithms used to classify hyperspectral images, followed by the methodology and the proposed model. In the dataset description section we present the three datasets taken for this project.

Literature Survey
Shutao Li, Weiwei Song and Pedram Ghamisi, senior members of IEEE, used the Indian Pines, Salinas and Pavia University datasets for the classification of hyperspectral images. From their paper we learn that deep CNNs, i.e. both 2D-CNN and 3D-CNN, assure better classification performance compared to the other traditional methods used in remote sensing, which include CNN- and SVM-based ensemble methods. The proposed framework is less complex but effective. The image weights are fed into the classifier to instantiate the training and testing process. The Houston dataset was the first to be experimented on, and the Gabor CNN produced much more detailed edges as the end result. The accuracy rates after classification are given in the section below [1].
Jon Atli Benediktsson, Yushi Chen and Pedram Ghamisi applied the various deep models most commonly used for classification, including stacked autoencoders (SAE), deep belief networks (DBN), convolutional neural networks (CNN), and recurrent neural networks (RNN). These deep learning networks are classified into spectral-feature, spatial-feature and spectral-spatial-feature networks that extract data categorically. The datasets taken are Salinas, Indian Pines and Pavia University. The paper concludes that classification accuracies are better when using deep learning models than non-deep-learning methodologies [2]. Lin Zhu, Kaiquiang Zhu and Pedram Ghamisi, in their paper "Automatic Design of Convolution Neural Network for Hyperspectral Image Classification", implemented two automatic HSI classification techniques on Indian Pines: 1-D Auto-CNN and 3-D Auto-CNN are put forth as classifiers of spectral and spatial features, and an optimization technique reliant on gradient descent is performed [3]. Deren Li discussed the fusion of change detection with remote sensing. From the paper we learn that change detection is a complex process involving multiple factors; it focuses on linear area features and terrain features, which are separated using Digital Line Graphs (DLG) and a Digital Elevation Model (DEM). After the advent of segmentation techniques, researchers have employed different methodologies [4]. Tong Li, Junping Zhang and Ye Zhang, in their paper, implemented classification with spatial-spectral information on the dataset. The deep belief network (DBN) shows the best overall precision among the traditional methods, though SVM had better results in some ground-cover recognition [5].
The main aim of the work in [6] is to advance a domain-adaptive learning method based on MOGP to generate feature descriptors for image classification. In the remote sensing world, the hyperspectral image classification problem is well known; the methodologies proposed there are DBN and SAE [7]. Students of Zhejiang University devised a subtle technique to derive accuracy from image classification in a more efficient manner: pixel-wise spectral information was extracted and preprocessed [8].
In the process of image classification, feature extraction becomes a critical procedure, as it aims to improve the domain-adaptive mechanism. The pixel-wise classification model was built using Principal Component Analysis (quantitative study), SVM and Logistic Regression (LR), followed by a CNN [9].
The CNN has real-time detection potential, as it has a short detection time; most CNN-based models have achieved their best performances with accuracies exceeding 85%. Miayong Zhu and Zhou, in their paper, established CNN models that enable us to identify different soybean varieties using hyperspectral images. In their survey, they connected stress-resistance ability with the soybean variety. The convolutional neural network (CNN) makes use of the regular pixel-wise spectra of the various soybean figures. The CNN models resulted in good performance in predicting the soybean variety, and as the number of soybeans increased, the correctness improved [10].

Methodology
The most widely used classification algorithm in hyperspectral image studies is SVM, because of its effectiveness, performance, and ability to handle high-dimensional feature spaces, where the number of features is very high. It is a supervised machine learning algorithm which can be used for either classification or regression challenges. Within SVM, each pixel is plotted as a point in r-dimensional space, the value of each coordinate being the value of a specific feature. Classification is then conducted by finding the hyperplane that best differentiates the two categories. Naive Bayes is a classification strategy based on Bayes' theorem with an assumption of independence among predictors. Simply put, a Naive Bayes classifier assumes that the presence of a particular feature in a class is unrelated to any other feature. It is not a single algorithm but a family of algorithms that share the same principle, i.e. every pair of features being classified is independent of the others.
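As a minimal sketch of the pixel-wise SVM classifier described above (synthetic spectra and placeholder hyperparameters, not the exact configuration used in the experiments), each pixel can be treated as a point in band-dimensional space and separated by a kernel SVM:

```python
import numpy as np
from sklearn.svm import SVC
from sklearn.model_selection import train_test_split

# Synthetic stand-in for labelled pixel spectra: 500 pixels, 30 bands, 3 classes.
rng = np.random.default_rng(0)
n_pixels, n_bands, n_classes = 500, 30, 3
labels = rng.integers(0, n_classes, n_pixels)
# Give each class a distinct mean spectrum so the toy problem is separable.
spectra = rng.normal(size=(n_pixels, n_bands)) + labels[:, None] * 2.0

X_train, X_test, y_train, y_test = train_test_split(
    spectra, labels, test_size=0.3, random_state=0)

# An RBF-kernel SVM finds the hyperplane (in the kernel-induced space)
# that best separates the classes of pixel spectra.
clf = SVC(kernel="rbf", C=1.0, gamma="scale").fit(X_train, y_train)
accuracy = clf.score(X_test, y_test)
```

On real HSI data the spectra, class count and SVM hyperparameters would come from the dataset and a validation search rather than the fixed values above.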
K-Nearest Neighbors is among the most basic yet essential classification techniques in machine learning. It belongs to the supervised learning field and finds intensive use in pattern recognition, data mining and intrusion detection. It is readily applicable in real-life situations as it is non-parametric, which means that it makes no fundamental assumptions about the distribution of the data. We are provided some prior data (also known as training data) that classifies coordinates into groups identified by an attribute.
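A small hedged sketch of this idea on synthetic pixel spectra (the band count, class separation and k are illustrative assumptions): a query pixel is assigned the majority class of its k nearest training pixels in spectral space.

```python
import numpy as np
from sklearn.neighbors import KNeighborsClassifier
from sklearn.model_selection import train_test_split

# Synthetic stand-in for labelled pixel spectra: 300 pixels, 20 bands, 3 classes.
rng = np.random.default_rng(7)
labels = rng.integers(0, 3, 300)
spectra = rng.normal(size=(300, 20)) + labels[:, None] * 2.0

X_train, X_test, y_train, y_test = train_test_split(
    spectra, labels, test_size=0.3, random_state=7)

# KNN is non-parametric: it stores the training pixels and classifies a
# new pixel by the majority label among its 5 closest neighbours.
knn = KNeighborsClassifier(n_neighbors=5).fit(X_train, y_train)
accuracy = knn.score(X_test, y_test)
```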
The KNN algorithm presumes that similar things exist in close proximity; in other words, similar samples lie near each other. KNN is popular because constructing a KNN classifier requires no domain awareness or parameter setting, and it is therefore ideal for exploratory data analysis. Table 2 shows the layers in order, according to the proposed hybrid CNN architecture with a window size of 25x25 and a total of 5,122,176 trainable parameters.

Proposed Model
The raw input image is taken as I, with width M, height N and number of spectral bands D. Every pixel in the image I consists of D spectral measurements and forms a one-hot label vector Y = (y1, y2, ..., yC), where C denotes the number of land-cover categories. Moreover, the hyperspectral pixels portray mixed land-cover classes, incorporating high intra-class variance and inter-class similarity into the image I. To start off, the spectral redundancy is first removed using Principal Component Analysis (PCA).
This traditional algorithm is applied to the spectral bands of the original hyperspectral image I. After applying PCA, the number of spectral bands is reduced from D to F while the same spatial dimensions are retained. The image derived after applying PCA is denoted as X. The HSI data cube is then divided into small overlapping 3D patches for classification; the label of each patch is determined by the ground-truth label of its central pixel.
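The two preprocessing steps above can be sketched as follows (synthetic cube, illustrative sizes for M, N, D, F and the window; the real values come from the dataset and Table 2): PCA over the spectral axis, then extraction of overlapping 3D patches labelled by their central pixel.

```python
import numpy as np
from sklearn.decomposition import PCA

# Synthetic stand-in for the image I: M x N spatial pixels, D spectral bands.
M, N, D, F = 20, 20, 50, 10          # F = reduced number of bands after PCA
rng = np.random.default_rng(1)
cube = rng.normal(size=(M, N, D))

# PCA over the spectral axis: treat each pixel as one D-dimensional sample,
# reduce D bands to F components, keep the spatial dimensions unchanged.
flat = cube.reshape(-1, D)
reduced = PCA(n_components=F).fit_transform(flat).reshape(M, N, F)

def extract_patches(img, window=5):
    """Split the cube into small overlapping 3D patches, one per pixel;
    each patch would carry the ground-truth label of its central pixel."""
    m = window // 2
    padded = np.pad(img, ((m, m), (m, m), (0, 0)), mode="edge")
    patches = [padded[r:r + window, c:c + window, :]
               for r in range(img.shape[0])
               for c in range(img.shape[1])]
    return np.stack(patches)

patches = extract_patches(reduced, window=5)
```

With these toy sizes, `reduced` has shape (20, 20, 10) and `patches` has shape (400, 5, 5, 10), one patch per pixel.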
The convolution is carried out by computing the sum of the dot product between the input data and the kernel. In a 2D convolutional neural network, the input image is convolved with 2D kernels, producing the activation value at position (x, y). The kernel is strided over the input image to cover the full spatial dimension. In the model planned for HSI data, the convolution-layer feature maps are created using a 3D kernel over multiple contiguous bands in the input layer; this gathers the spectral information needed to predict the image. For 3D convolution, the activation value is computed at position (x, y, z). The CNN [12] comprises various parameters such as the kernel weights and biases. The dataset is trained using both unsupervised and supervised learning methods; traditional methods such as SVM and gradient-descent techniques are used for optimization. To extract spectral and spatial features concurrently, 3D convolutions are applied several times. It is observed that the first dense layer has the maximum number of parameters compared to the number of parameters in the last layer.
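To make the "sum of the dot product" definition concrete, here is a naive, framework-free sketch of a single "valid" 3D convolution (in the cross-correlation convention that CNN layers actually use); the patch size and all-ones kernel are arbitrary assumptions for illustration.

```python
import numpy as np

def conv3d_valid(volume, kernel):
    """Naive 'valid' 3D convolution: each output activation at (x, y, z)
    is the sum of the element-wise product between the kernel and the
    matching sub-volume of the input."""
    vx, vy, vz = volume.shape
    kx, ky, kz = kernel.shape
    out = np.zeros((vx - kx + 1, vy - ky + 1, vz - kz + 1))
    for x in range(out.shape[0]):
        for y in range(out.shape[1]):
            for z in range(out.shape[2]):
                out[x, y, z] = np.sum(
                    volume[x:x + kx, y:y + ky, z:z + kz] * kernel)
    return out

# A 3D kernel spanning several contiguous spectral bands captures joint
# spectral-spatial structure, unlike a 2D kernel that sees one band at a time.
patch = np.arange(5 * 5 * 8, dtype=float).reshape(5, 5, 8)
fmap = conv3d_valid(patch, np.ones((3, 3, 3)))
```

A 5x5x8 patch convolved with a 3x3x3 kernel yields a 3x3x6 feature map; in a real layer many such kernels run in parallel, with learned weights and biases, followed by a non-linearity.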
Therefore, the total number of features in the proposed model depends on the data taken. For the Indian Pines dataset, all the parameters are trained using the deep learning mechanism of the back-propagation algorithm. Table 3 shows the summary of the layers in the spectral network architecture proffered for the Indian Pines dataset, with a window size of 25x25.

Dataset Description
In this work, we have taken three publicly accessible hyperspectral image datasets: Indian Pines, Pavia University and Salinas Scene. The Indian Pines dataset was gathered by an AVIRIS (Airborne Visible/Infrared Imaging Spectrometer) sensor above the Indian Pines test site. The Indian Pines (IP) images have 145 x 145 spatial dimensions and 224 spectral bands, and the available ground truth is divided into 16 vegetation classes. This dataset provides many training and testing samples; various algorithms were implemented to study the effects of varying window sizes and of PCA. The Pavia University dataset was taken by a ROSIS (Reflective Optics System Imaging Spectrometer) sensor. It comprises 610 x 340 spatial pixels along with 103 spectral bands, and its ground truth is divided into 9 land-cover classes. The third dataset, the Salinas Scene, was taken with the help of an AVIRIS sensor above the Salinas Valley, which is located in California. All the work pertaining to this project was implemented on a Dell Inspiron laptop with 16 GB of RAM. Table 4 shows the distribution of training and testing samples as well as the Overall Accuracy (OA), Kappa coefficient and Average Accuracy (AA) achieved by HybridSN and other methods on the Indian Pines dataset.

Result and Discussion
In this paper, the final results obtained are compared with traditional classification methods. The most widely used classification algorithm in hyperspectral image studies is SVM, because of its effectiveness, performance, and ability to handle high-dimensional feature spaces, where the number of features is very high. It is a supervised machine learning algorithm which can be used for either classification or regression challenges.
Within SVM [11], each pixel is plotted as a point in r-dimensional space, the value of each coordinate being the value of a specific feature; classification is conducted by finding the hyperplane that best differentiates the two categories. The parameters taken into consideration, namely Overall Accuracy (OA), Average Accuracy (AA) and the Kappa coefficient (Kappa), are used to define the precision of the classification algorithm.
These three parameters help us evaluate and predict hyperspectral image classification performance. OA represents the ratio of correctly categorized samples to the overall test samples; AA gives the average per-class classification accuracy; and Kappa is a statistical metric that measures the agreement between the ground-truth map and the classification map. The results obtained in terms of these parameters, i.e. Overall Accuracy, Average Accuracy and Kappa coefficient for the different classification algorithms, are shown in Table 4. The dataset is split into 70% for training and the remaining 30% for testing. It is observed that the 3D-CNN algorithm outperforms the other methodologies. Table 5 shows the results when the entire dataset is split into 10% for training and the remaining 90% for testing. The influence of the spatial dimension on the spatial-network model performance is reported in Table 6, where IP denotes the Indian Pines dataset, PU the Pavia University dataset and SA the Salinas Scene dataset. The 25 x 25 spatial window size was found to be optimal for the proposed method to obtain a good accuracy score. From these results it is also observed that the 3D-CNN performance on the Salinas Scene dataset is lower than that of the 2D-CNN. This CNN approach eases the overfitting phenomenon and thereby improves the classification results. Although this procedure is time consuming, it does not require human assistance. For the proposed method, the accuracy and loss convergence over 100 epochs of training and validation is shown in Figure 2. The proposed model converges in approximately 50 epochs, which indicates quick convergence and in turn gives us a better understanding of the dataset and the hyperspectral image.
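The three evaluation metrics can be computed directly from a confusion matrix; the sketch below uses a tiny hand-made label example (not the paper's data) to show how OA, AA and Kappa are defined.

```python
import numpy as np

def classification_scores(y_true, y_pred, n_classes):
    """Overall Accuracy, Average Accuracy and Kappa from label vectors."""
    cm = np.zeros((n_classes, n_classes))
    for t, p in zip(y_true, y_pred):
        cm[t, p] += 1
    total = cm.sum()
    oa = np.trace(cm) / total                 # correctly classified / all samples
    per_class = np.diag(cm) / cm.sum(axis=1)  # per-class accuracy (recall)
    aa = per_class.mean()                     # mean of the per-class accuracies
    # Kappa compares observed agreement (OA) with the agreement expected
    # by chance (pe) between the ground-truth map and the classification map.
    pe = (cm.sum(axis=0) * cm.sum(axis=1)).sum() / total ** 2
    kappa = (oa - pe) / (1 - pe)
    return oa, aa, kappa

# Toy example: 6 test pixels, 3 classes, one class-1 pixel misclassified as 0.
y_true = [0, 0, 1, 1, 2, 2]
y_pred = [0, 0, 1, 0, 2, 2]
oa, aa, kappa = classification_scores(y_true, y_pred, 3)
```

Here OA and AA both come out to 5/6 while Kappa is 0.75, illustrating that Kappa discounts the agreement a random classifier would reach by chance.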
In this experiment it is found that the performance of every model decreases significantly with the smaller training split, while in almost all cases the proposed model still surpasses the other models.

Conclusion
As discussed in this survey, predictive analysis on hyperspectral images has been conducted for many years, but only after the evolution of deep learning models have better classification accuracies been achieved. Deep learning is the most simple and efficient way to predict hyperspectral images provided by Geographical Information Systems. In the above survey, various experiments were conducted to validate and compare the effectiveness of different strategies; among the available techniques, SVM, SSRN, 2D-CNN and 3D-CNN were used, and the deep CNN models give good results. Learning algorithms such as decision tree, support vector machine and naive Bayes also produce accurate results; however, among all of them, HybridSN gives the most accurate result.