1. Introduction
As one of the three major grain crops in China, wheat occupies an important position in people's daily lives and in the national economy [1]. Accurately identifying winter wheat in high-resolution images relies on remote sensing technology and convolutional neural networks [2,3]. Accurate identification is essential for mapping the spatial distribution of winter wheat and for safeguarding grain yield [4].
Convolutional neural networks (CNNs) can automatically extract abstract features and reduce the number of parameters in the network during deep propagation [5]. They are widely used in image recognition [6,7,8,9,10], object detection [11,12,13], semantic segmentation [14,15,16,17,18], remote sensing scene classification [19,20,21], image denoising [22,23,24] and other fields. Their extensive application in computer vision suggests that the same techniques can be applied effectively to the automatic interpretation of remote sensing images [25,26]. Semantic segmentation has become an important technique for extracting crop information from high-resolution images. For example, Zhang et al. [27] constructed a hybrid structure convolutional neural network (HSCNN) with two components of different depths and used it to extract winter wheat planting information in their study area. Compared with SegNet [28] and DeepLab [29], their method improved accuracy by 0.147 and 0.059, respectively. Chen [30] constructed a multi-scale feature convolutional neural network (MSFCNN) model to extract winter wheat planting information in the study area. Wang et al. [31] improved the extraction accuracy of image edge pixels, and thereby the classification accuracy of winter wheat, by using a partially connected conditional random field (PCCRF) model to refine the classification results of RefineNet [32]; their results showed that this method effectively extracted the spatial distribution of winter wheat. In addition, Teimouri et al. [33] developed a lightweight deep learning network that combines a fully convolutional network with a convolutional long short-term memory network (ConvLSTM) and applied it to the spatial information extraction of 14 crops in their study area, obtaining an average pixel-based accuracy of 86% and an Intersection over Union of 0.64. In semantic segmentation of remote sensing images, encoder–decoder structures similar to UNet, which concatenate low-level and high-level features, have achieved better results.
However, current research relies primarily on Sentinel and Landsat data and makes limited use of Chinese GF data, mainly because GF data are relatively difficult to access: they are typically controlled and managed by the government or relevant organizations, and access requires specific application procedures or appropriate permissions. The spatial resolution of GF-2 data can reach the sub-meter level. GF-2 PMS data provide optical images at a resolution of 0.8 m, far finer than Sentinel (10 m) and Landsat (30 m), and can therefore capture details of winter wheat such as texture and field shape. However, few public GF-2 winter wheat datasets are available for training on high-resolution images, and traditional machine learning methods still require manual feature extraction. As a result, only shallow features can be learned when training on high-resolution images, which limits the final recognition accuracy. Furthermore, because winter wheat planting areas vary in size, patches in remote sensing images also vary in size, which leads to unbalanced samples in the dataset. How to construct a high-resolution remote sensing image dataset of winter wheat, and how to use a convolutional neural network to identify winter wheat accurately in such images, therefore remain open problems.
This paper proposes SUNet (Shuffle Attention UNet), a network model that incorporates an attention mechanism. Its primary purpose is to address sample imbalance in the dataset and the task of extracting winter wheat from high-resolution images. To make the network pay more attention to important features, a batch normalization layer is added after the convolution layers to improve recognition accuracy, and in the decoding stage the low-level features from the encoding stage are optimized to strengthen the model’s focus on regions of interest. The focal loss function is then used to compute the loss, which alleviates the sample imbalance problem and thus improves the classification accuracy of winter wheat. In addition, a data augmentation strategy is used to expand the winter wheat dataset during training, which makes the model more robust and helps avoid overfitting.
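To illustrate how the loss function mentioned above can counter sample imbalance, the following is a minimal sketch of a binary focal loss, assuming the loss is the standard focal loss FL(p_t) = -alpha_t (1 - p_t)^gamma log(p_t). The function name `binary_focal_loss` and the defaults `gamma=2.0` and `alpha=0.25` are illustrative assumptions, not values taken from this paper, and the sketch works on flat Python lists rather than on the tensors a real training pipeline would use.

```python
import math


def binary_focal_loss(probs, labels, gamma=2.0, alpha=0.25, eps=1e-7):
    """Mean binary focal loss over a flat list of pixels.

    probs  -- predicted probability of the positive class (e.g. winter wheat)
    labels -- ground-truth labels, 1 for positive and 0 for background
    gamma, alpha -- illustrative defaults (assumption), not the paper's settings
    """
    if len(probs) != len(labels) or not probs:
        raise ValueError("probs and labels must be non-empty and equally long")
    total = 0.0
    for p, y in zip(probs, labels):
        p = min(max(p, eps), 1.0 - eps)       # clamp so that log() stays finite
        p_t = p if y == 1 else 1.0 - p        # probability given to the true class
        alpha_t = alpha if y == 1 else 1.0 - alpha
        # (1 - p_t) ** gamma shrinks the loss of pixels that are already
        # classified well, so hard pixels dominate the average.
        total += -alpha_t * (1.0 - p_t) ** gamma * math.log(p_t)
    return total / len(probs)


# A confidently correct pixel contributes far less than a poorly classified one.
easy = binary_focal_loss([0.95], [1])
hard = binary_focal_loss([0.30], [1])
print(easy < hard)
```

With `gamma=0` and `alpha=0.5` the expression reduces to half of the ordinary binary cross-entropy, which makes the role of the modulating factor easy to check in isolation.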
In summary, the objectives of this study are as follows:
- (1) Construct a public high-resolution winter wheat dataset based on China's GF-2 satellite. The dataset contains six bands (red, green, blue, near-infrared, NDVI and NDVIincrease) and provides rich image samples and labeling information.
- (2) Propose the SUNet network model, which introduces a batch normalization layer and the Shuffle Attention mechanism. The comparison tests and ablation experiments show that the SUNet model improves generalization ability and classification accuracy.
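The NDVI band listed in objective (1) can be derived from the red and near-infrared bands with the standard definition NDVI = (NIR - Red) / (NIR + Red). The sketch below shows one way a six-band sample might be assembled. The helper names `ndvi` and `stack_six_bands` are hypothetical, the band order is an assumption, and because NDVIincrease is not defined in this section it is treated as a precomputed input rather than derived here.

```python
def ndvi(nir, red):
    """Per-pixel NDVI = (NIR - Red) / (NIR + Red) for two equally sized 2D grids."""
    out = []
    for nir_row, red_row in zip(nir, red):
        row = []
        for n, r in zip(nir_row, red_row):
            denom = n + r
            # Use 0 where both bands are 0 to avoid division by zero.
            row.append(0.0 if denom == 0 else (n - r) / denom)
        out.append(row)
    return out


def stack_six_bands(red, green, blue, nir, ndvi_increase):
    """Return [R, G, B, NIR, NDVI, NDVIincrease] as a (6, rows, cols) nested list.

    ndvi_increase is assumed to be computed elsewhere (hypothetical input).
    """
    rows, cols = len(red), len(red[0])
    for band in (red, green, blue, nir, ndvi_increase):
        if len(band) != rows or any(len(row) != cols for row in band):
            raise ValueError("all bands must share the same shape")
    return [red, green, blue, nir, ndvi(nir, red), ndvi_increase]


# Toy 2 x 2 reflectance values, for illustration only.
red = [[0.2, 0.1], [0.3, 0.0]]
green = [[0.3, 0.2], [0.3, 0.1]]
blue = [[0.1, 0.1], [0.2, 0.1]]
nir = [[0.6, 0.5], [0.3, 0.0]]
increase = [[0.0, 0.1], [0.0, 0.0]]
sample = stack_six_bands(red, green, blue, nir, increase)
print(len(sample), sample[4])
```

Dense vegetation reflects strongly in the near-infrared and absorbs red light, so NDVI values close to 1 indicate vigorous crops, while values near 0 indicate bare soil or built-up surfaces.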
5. Conclusions
This work studied the recognition of winter wheat with a convolutional neural network in a cloud environment. It improved the UNet model to obtain SUNet and used the AI Studio cloud platform to study the application of convolutional neural networks to winter wheat recognition in GF-2 high-resolution images. The conclusions are as follows. The high-performance environment provided by the AI Studio cloud platform offers efficient computing support for deep learning methods that identify winter wheat in high-resolution images. SUNet adds a BN layer and Shuffle Attention to UNet and is trained with the focal loss function on a winter wheat dataset of multi-band GF-2 images that include NDVI and NDVIincrease. This effectively optimizes the features and thus improves the precision of winter wheat recognition in high-resolution images. The mIoU calculated from the confusion matrix is 0.9514, and the overall classification accuracy, F1 score and kappa coefficient are 0.9781, 0.9663 and 0.9501, respectively. Deep learning has a certain degree of fault tolerance: after training, the convolutional network can identify unlabeled and mislabeled areas. This work still has limitations. The dataset is suited to extracting the spatial information of winter wheat during a specific period, and SUNet has so far shown superior performance only in that task. Its performance in winter wheat change detection, which involves detecting variations of winter wheat between different time points, remains unclear, so further research and evaluation are needed to determine its applicability to that task.
The next step is to combine the time-series variations of the winter wheat growth cycle and extract the spatial distribution information of winter wheat during different periods.
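The evaluation measures reported in the conclusions (mIoU, overall accuracy, F1 score and kappa coefficient) can all be derived from a confusion matrix. The sketch below uses their standard definitions. The function name `segmentation_metrics`, the convention that rows are true classes and columns are predicted classes, and the toy matrix are assumptions for illustration; they are not the paper's evaluation code or data.

```python
def segmentation_metrics(cm):
    """Compute standard metrics from a square confusion matrix.

    cm[i][j] is the number of pixels of true class i predicted as class j
    (row/column convention is an assumption of this sketch).
    """
    n = len(cm)
    total = sum(sum(row) for row in cm)
    if total == 0:
        raise ValueError("confusion matrix is empty")
    row_sums = [sum(cm[i]) for i in range(n)]                       # true pixels per class
    col_sums = [sum(cm[i][j] for i in range(n)) for j in range(n)]  # predicted pixels per class
    tp = [cm[i][i] for i in range(n)]

    overall_accuracy = sum(tp) / total
    # IoU_c = TP / (TP + FP + FN) = TP / (row + col - TP)
    iou = [tp[i] / (row_sums[i] + col_sums[i] - tp[i])
           if row_sums[i] + col_sums[i] - tp[i] > 0 else 0.0 for i in range(n)]
    # F1_c = 2 TP / (2 TP + FP + FN) = 2 TP / (row + col)
    f1 = [2 * tp[i] / (row_sums[i] + col_sums[i])
          if row_sums[i] + col_sums[i] > 0 else 0.0 for i in range(n)]
    # Cohen's kappa compares observed agreement with chance agreement.
    expected = sum(row_sums[i] * col_sums[i] for i in range(n)) / (total * total)
    kappa = (overall_accuracy - expected) / (1 - expected) if expected < 1 else 0.0

    return {
        "overall_accuracy": overall_accuracy,
        "miou": sum(iou) / n,
        "f1_per_class": f1,
        "kappa": kappa,
    }


# Toy two-class matrix (background, wheat), for illustration only.
metrics = segmentation_metrics([[50, 10], [5, 35]])
print(metrics)
```

Kappa is lower than overall accuracy whenever part of the agreement could arise by chance, which is why it is often reported alongside accuracy for imbalanced classes.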