Land classification based on high resolution remote sensing images

Land information is essential to the timely adjustment of land management policy and can significantly improve the efficiency of land management. Traditional land information collection depends entirely on field surveys by grass-roots personnel, which is time-consuming and laborious and cannot guarantee the accuracy of the information. With the rapid development of remote sensing technology, remote sensing images have become an important means of conducting land surveys and checking whether the crop planting area meets the standard. At present, classification by artificial neural network (ANN) learning has achieved great success in land classification, but there is still room for improvement in classification accuracy. In this paper, the shuffled frog leaping algorithm (SFLA) is introduced to optimize the important parameters of the ANN. The traditional traversal method is replaced by a combination of global and local optimization, which effectively reduces the search time, increases the generalization ability of the model and improves the classification accuracy. Experiments on remote sensing data of Fengqi Town, Luochuan County, Yan’an City, Shaanxi Province show that SFLA effectively increases the accuracy of ANN in land information classification, which meets the research goal of this paper.


Introduction
In the past, crop acreage data came mainly from two sources: China Agricultural Statistics, published by the Ministry of Agriculture, and the China Rural Statistical Yearbook, published by the National Bureau of Statistics since 2001 [2]. All of this information is based on field research by grass-roots personnel. The development of space remote sensing technology has provided new scientific and technological means that greatly assist the extraction of crop planting area. As early as 1979, Mr. Chen Peng began to advocate the use of remote sensing for grain yield estimation in China. After years of development, the original experiments and demonstrations have grown into nationwide application of grain yield estimation, and a series of successful technical methods have been explored [2].
In recent years, machine learning and optimization algorithms have developed rapidly. By combining machine learning with optimization algorithms, crop planting area detection can be improved to achieve accurate and rapid classification of field crops, which is of great significance for the remote sensing evaluation of crop production levels. In 2008, Manoj Karkee et al. [1] proposed an ANN method to quantify the sub-pixel land use of individual rice types. The artificial neural network was used as a model inverter to estimate parameters; the crop area fraction within a pixel was predicted and the average error at the seedling stage was estimated. In 2019, Tian Fufu et al. [3], using Sentinel-2 as the data source, extracted soybean with a multi-layer neural network and compared it with machine learning methods such as random forest, decision tree and support vector machine (SVM). They found that aggregating the neural network classification results after SLIC object-oriented segmentation ignores the tiny differences within the same plot while distinguishing the crops in different plots, and reflects the distribution of soybean well.
In short, ANN gives good results in land classification, but very little research at home or abroad addresses how to optimize ANN parameters. As a result, parameters are set from experience rather than from the actual data, which costs some accuracy. In this paper, we study the parameter optimization problem and find that SFLA can effectively improve the classification accuracy.

Classification using neural network learning
In 1982, the American physicist John Hopfield published several papers on neural networks in the Proceedings of the National Academy of Sciences, emphasizing the application value of neural networks. According to their learning algorithm and function, neural networks can be divided into Bayesian neural networks, BP (back-propagation) neural networks, convolutional neural networks, and so on. In this paper, the BP neural network is chosen [4].
The feedforward neural network is divided into three layers: an input layer, a hidden layer and an output layer. Figure 1 shows the basic structure of a typical three-layer neural network: layer L1 is the input layer, layer L2 is the hidden layer, and layer L3 is the output layer. The functional relationship between them is as follows:
$$H_j = f\left(\sum_{i=1}^{n} w_{ij} x_i - a_j\right) \qquad (1)$$
$$O_t = f\left(\sum_{j} w_{jt} H_j - b_t\right), \quad t = 1, \dots, q \qquad (2)$$
Figure 1 Three-layer structure of the neural network
In Equation (1), $H_j$ is the output value of the $j$-th node in the hidden layer; $w_{ij}$ is the weight between the $i$-th node in the input layer and the $j$-th node in the hidden layer; $x_i$ is the input value of the $i$-th node in the input layer; $n$ is the number of nodes in the input layer; $a_j$ is the threshold value of the $j$-th node in the hidden layer; and $f$ is the transfer function. In Equation (2), $O_t$ is the output value of the $t$-th node in the output layer; $w_{jt}$ is the weight between the $j$-th node in the hidden layer and the $t$-th node in the output layer; $q$ is the number of nodes in the output layer; and $b_t$ is the threshold value of the $t$-th node in the output layer.
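The layer-by-layer computation described above can be sketched in plain Python. This is only an illustration: it assumes a sigmoid transfer function and thresholds that are subtracted from the weighted sums, and the names `forward`, `w_in`, `w_out`, `a` and `b` are hypothetical, not taken from the study's code.

```python
import math

def sigmoid(z):
    """Assumed transfer function; the paper does not state which one is used."""
    return 1.0 / (1.0 + math.exp(-z))

def forward(x, w_in, a, w_out, b, f=sigmoid):
    """Forward pass of a three-layer network.

    x      : inputs of the n input-layer nodes
    w_in   : w_in[i][j], weight from input node i to hidden node j
    a      : a[j], threshold of hidden node j
    w_out  : w_out[j][t], weight from hidden node j to output node t
    b      : b[t], threshold of output node t
    """
    hidden = [
        f(sum(w_in[i][j] * x[i] for i in range(len(x))) - a[j])
        for j in range(len(a))
    ]
    output = [
        f(sum(w_out[j][t] * hidden[j] for j in range(len(hidden))) - b[t])
        for t in range(len(b))
    ]
    return hidden, output
```

With three inputs (one per spectral band) and five outputs (one per land class), `forward` returns one activation per hidden node and one score per class.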
Equation (3) defines the loss function, where $O_t^*$ is the ideal output value of the $t$-th node in the output layer. To minimize the output error of the neural network, the artificial neural network in this study is fine-tuned by gradient descent, which introduces the learning rate parameter. If the learning rate is set too high, training is accelerated in the early stage and the model approaches a local or global optimum more easily. In the later stage, however, a large learning rate may cause large fluctuations in training, and the loss value may even hover around the minimum, making the optimal solution hard to reach. If the learning rate is set too small, learning is greatly slowed in the early stage; the final accuracy of the model may be relatively high, but this may also lead to over-fitting and a loss of generalization. In short, the learning rate affects the classification accuracy, so this study uses SFLA to optimize it and thereby improve the accuracy of classification.
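The effect of the learning rate can be seen on a toy problem. The sketch below runs plain gradient descent on a one-dimensional quadratic loss, E(w) = (w - 3)^2, which is a stand-in chosen for illustration and not the network's actual loss; the function name and default values are assumptions.

```python
def gradient_descent(learning_rate, steps=50, w=0.0, target=3.0):
    """Minimize E(w) = (w - target)**2 and return the final w."""
    for _ in range(steps):
        grad = 2.0 * (w - target)   # dE/dw
        w -= learning_rate * grad   # gradient descent update
    return w

# Distance from the minimum after 50 steps for three learning rates.
for lr in (0.01, 0.4, 1.0):
    print(lr, abs(gradient_descent(lr) - 3.0))
```

On this toy loss, the small rate (0.01) is still far from the minimum after 50 steps, the moderate rate (0.4) converges, and the large rate (1.0) jumps back and forth across the minimum without approaching it.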

Parameter optimization using the Shuffled Frog Leaping Algorithm
The shuffled frog leaping algorithm (SFLA) was first proposed by Eusuff and Lansey in 2003 [5] to solve combinatorial optimization problems. SFLA is a bio-inspired intelligent optimization algorithm that combines the advantages of the memetic algorithm (MA), which is based on memetic evolution, and particle swarm optimization (PSO), which is based on swarm behavior. The algorithm has a simple concept, fast computation and few parameters to adjust, and it is easy to implement.
Therefore, the parameter-search capability of SFLA can be used to find a learning rate that significantly improves the classification accuracy of the neural network.
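A simplified one-dimensional version of the search can be sketched as follows. It is an illustration under stated assumptions rather than the implementation used in this study: the population size, number of memeplexes and iteration counts are arbitrary, the sub-memeplex sampling of the original algorithm is omitted, and the name `sfla_maximize` is hypothetical. Each frog is a candidate parameter value; frogs are ranked, dealt into memeplexes, the worst frog of each memeplex leaps toward the memeplex best (local search) or the global best, and the memeplexes are then shuffled together (global exchange).

```python
import random

def sfla_maximize(fitness, lo, hi, n_frogs=20, n_memeplexes=4,
                  local_iters=5, n_shuffles=10, seed=0):
    """Search [lo, hi] for the value that maximizes fitness(value)."""
    rng = random.Random(seed)
    clip = lambda v: min(max(v, lo), hi)
    frogs = [rng.uniform(lo, hi) for _ in range(n_frogs)]
    history = []  # best fitness after each shuffle
    for _ in range(n_shuffles):
        # Global step: rank all frogs and deal them into memeplexes.
        frogs.sort(key=fitness, reverse=True)
        global_best = frogs[0]
        memeplexes = [frogs[k::n_memeplexes] for k in range(n_memeplexes)]
        for plex in memeplexes:
            # Local step: repeatedly improve the worst frog of the memeplex.
            for _ in range(local_iters):
                plex.sort(key=fitness, reverse=True)
                best, worst = plex[0], plex[-1]
                cand = clip(worst + rng.random() * (best - worst))
                if fitness(cand) <= fitness(worst):
                    # No gain: leap toward the global best instead.
                    cand = clip(worst + rng.random() * (global_best - worst))
                if fitness(cand) <= fitness(worst):
                    # Still no gain: replace with a random frog.
                    cand = rng.uniform(lo, hi)
                plex[-1] = cand
        # Shuffle: merge the memeplexes back into one population.
        frogs = [frog for plex in memeplexes for frog in plex]
        history.append(max(fitness(frog) for frog in frogs))
    return max(frogs, key=fitness), history
```

Because only the worst frog of a memeplex is ever replaced, the best fitness in `history` never decreases from one shuffle to the next. In practice each fitness evaluation would train a network, so caching fitness values would be worthwhile.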

Dataset and Features
The remote sensing image data used in this study come from Google Earth. Google Earth imagery is fused from a variety of data sources, including aerial images and remote sensing satellite images. The ground resolution of Google Earth images is 100 meters or finer, and in some non-sensitive areas of mainland China it reaches nearly 0.2 meters. As shown in Figure 2, this paper uses Google Earth remote sensing images of Fengqi Town, Luochuan County, Yan'an City, Shaanxi Province, to classify the land information in this area and to support the statistics and survey of land use resources.

Sample selection and object characteristics
In this paper, five main land cover types are defined in Fengqi Town: woodland, unused land, cultivated land, building land and grassland. Through visual interpretation combined with Google Earth, 600, 599, 600, 600 and 604 samples were selected in the study area for these five types, respectively. Dark green land as in Figure 3 (a) is defined as woodland; light yellow land as in Figure 3 (b) is defined as unused land; regular green land as in Figure 3 (c) is defined as cultivated land; land as in Figure 3 (d) is defined as building land; and light green plots as in Figure 3 (e) are defined as grassland. The object attributes selected in this study are mainly spectral features, consisting of the information of three bands.
Figure 3 The five land categories: (a) woodland, (b) unused land, (c) cultivated land, (d) building land and (e) grassland

Experiments
The key to using the Shuffled Frog Leaping Algorithm is to use the fitness value of each candidate parameter to guide the optimization. To obtain the fitness value, this study adopted the Multilayer Perceptron (MLP) provided by Python's Scikit-Learn module as the core model of the fitness function. Multilayer perceptrons are also called artificial neural networks (ANN). The MLP model was trained with different parameter settings, and its accuracy was then measured on the validation data set. This accuracy was taken as the fitness value, and the parameters were optimized according to it. Figure 4 shows that the fitness values under the original parameters before optimization are scattered, while after optimization they are roughly concentrated around the optimal parameter, whose value was found to be 0.0017. Figure 5 is a line chart of the optimization process: as the number of iterations increases, the accuracy gradually increases.
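A fitness function of this kind might look like the sketch below. It assumes that the tuned parameter is the initial learning rate (`learning_rate_init` of scikit-learn's `MLPClassifier`); the synthetic three-feature, five-class data, the hidden layer size and the train/validation split are placeholders for illustration, not the study's data or settings.

```python
from sklearn.datasets import make_classification
from sklearn.model_selection import train_test_split
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for the three-band spectral samples of five land classes.
X, y = make_classification(n_samples=600, n_features=3, n_informative=3,
                           n_redundant=0, n_classes=5,
                           n_clusters_per_class=1, random_state=0)
X_train, X_val, y_train, y_val = train_test_split(
    X, y, test_size=0.3, random_state=0, stratify=y)

def fitness(learning_rate):
    """Train an MLP with the given learning rate; return validation accuracy."""
    model = MLPClassifier(hidden_layer_sizes=(16,),
                          learning_rate_init=learning_rate,
                          max_iter=300, random_state=0)
    model.fit(X_train, y_train)
    return model.score(X_val, y_val)  # mean accuracy on the validation set

print(fitness(0.0017))
```

An optimizer then only needs to call `fitness` on each candidate learning rate and keep the candidates with the highest validation accuracy.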

Results
In the accuracy evaluation of image classification, the confusion matrix is the most widely used method. It can be used to calculate the overall accuracy, Kappa coefficient, producer's accuracy and user's accuracy of information extracted from high resolution remote sensing data. Table 1 shows the confusion matrix of the ANN classification under the default parameters. The overall accuracy reaches 80.97% and the Kappa coefficient reaches 0.7622. Among the classes, the user's accuracy of cultivated land is 72.85% and the producer's accuracy of woodland is 74.46%, mainly because of confusion between cultivated land and woodland. Table 2 shows the confusion matrix of the artificial neural network classification optimized by the shuffled frog leaping algorithm. The overall accuracy after optimization is higher than 85.00% and the Kappa coefficient is higher than 0.8210. The lowest user's accuracy, 76.04%, occurs for cultivated land, mainly because cultivated land is misclassified as woodland and grassland; the lowest producer's accuracy, 74.50%, occurs for building land, mainly because building land is classified as unused land.
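For reference, the four measures can be computed from a confusion matrix as in the sketch below. It assumes that rows hold the reference (ground truth) classes and columns hold the classified classes; the three-class matrix is invented for illustration and is not taken from Table 1 or Table 2.

```python
def accuracy_metrics(cm):
    """cm[r][c]: number of samples of reference class r classified as class c."""
    k = len(cm)
    total = sum(sum(row) for row in cm)
    diag = [cm[i][i] for i in range(k)]
    row_tot = [sum(cm[i]) for i in range(k)]                        # reference totals
    col_tot = [sum(cm[r][c] for r in range(k)) for c in range(k)]   # classified totals
    overall = sum(diag) / total
    # Agreement expected by chance, used by the Kappa coefficient.
    expected = sum(row_tot[i] * col_tot[i] for i in range(k)) / total ** 2
    kappa = (overall - expected) / (1 - expected)
    producer = [diag[i] / row_tot[i] for i in range(k)]  # 1 - omission error
    user = [diag[i] / col_tot[i] for i in range(k)]      # 1 - commission error
    return overall, kappa, producer, user

# Hypothetical three-class example.
example = [[50, 5, 5],
           [10, 40, 10],
           [0, 5, 55]]
overall, kappa, producer, user = accuracy_metrics(example)
```

A low producer's accuracy means that many reference samples of a class were missed, while a low user's accuracy means that many samples assigned to a class actually belong to other classes.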